David Rowley [Tue, 21 Oct 2025 07:46:49 +0000 (20:46 +1300)]
Fix BRIN 32-bit counter wrap issue with huge tables
A BlockNumber (32-bit) might not be large enough to add bo_pagesPerRange
to when the table contains close to 2^32 pages. At worst, this could
result in a cancellable infinite loop during the BRIN index scan with
power-of-2 pagesPerRange, and slow (inefficient) BRIN index scans and
scanning of unneeded heap blocks for non power-of-2 pagesPerRange.
Backpatch to all supported versions.
Author: sunil s <sunilfeb26@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAOG6S4-tGksTQhVzJM19NzLYAHusXsK2HmADPZzGQcfZABsvpA@mail.gmail.com
Backpatch-through: 13
I mistakenly included the "Replaces" lines describing the origin of
Result nodes in groupingsets.out, which actually come from a feature
not available in v18. Mea culpa.
Richard Guo [Tue, 21 Oct 2025 03:35:36 +0000 (12:35 +0900)]
Fix pushdown of degenerate HAVING clauses
67a54b9e8 taught the planner to push down HAVING clauses even when
grouping sets are present, as long as the clause does not reference
any columns that are nullable by the grouping sets. However, there
was an oversight: if any empty grouping sets are present, the
aggregation node can produce a row that did not come from the input,
and pushing down a HAVING clause in this case may cause us to fail to
filter out that row.
Currently, non-degenerate HAVING clauses are not pushed down when
empty grouping sets are present, since the empty grouping sets would
nullify the vars they reference. However, degenerate (variable-free)
HAVING clauses are not subject to this restriction and may be
incorrectly pushed down.
To fix, explicitly check for the presence of empty grouping sets and
retain degenerate clauses in HAVING when they are present. This
ensures that we don't emit a bogus aggregated row. A copy of each
such clause is also put in WHERE so that query_planner() can use it in
a gating Result node.
To facilitate this check, this patch expands the groupingSets tree of
the query to a flat list of grouping sets before applying the HAVING
pushdown optimization. This does not add any additional planning
overhead, since we need to do this expansion anyway.
In passing, make a small tweak to preprocess_grouping_sets() by
reordering its initial operations a bit.
Backpatch to v18, where this issue was introduced.
Reported-by: Yuhang Qiu <iamqyh@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0879D9C9-7FE2-4A20-9593-B23F7A0B5290@gmail.com
Backpatch-through: 18
Michael Paquier [Mon, 20 Oct 2025 23:08:25 +0000 (08:08 +0900)]
Fix POSIX compliance in pgwin32_unsetenv() for "name" argument
pgwin32_unsetenv() (compatibility routine of unsetenv() on Windows)
lacks the input validation that its sibling pgwin32_setenv() has.
Without these checks, calling unsetenv() with incorrect names crashes on
WIN32. However, invalid names should be handled, failing on EINVAL.
This commit adds the same checks as setenv() to fail with EINVAL for a
"name" set to NULL, an empty string, or if '=' is included in the value,
per POSIX requirements.
Like 7ca37fb0406b, backpatch down to v14. pgwin32_unsetenv() is defined
on REL_13_STABLE, but with the branch going EOL soon and the lack of
setenv() there for WIN32, nothing is done for v13.
Author: Bryan Green <dbryan.green@gmail.com>
Discussion: https://postgr.es/m/b6a1e52b-d808-4df7-87f7-2ff48d15003e@gmail.com
Backpatch-through: 14
Tom Lane [Mon, 20 Oct 2025 14:36:41 +0000 (10:36 -0400)]
Add .abi-compliance-history to v18 branch.
This is just a quick commit to verify that the buildfarm ABI
checker responds to this control file. Once we've tested
that, we will update the file to point at c8af5019b.
There's documentation mop-up yet to do, too.
Discussion: https://postgr.es/m/aPJ03E2itovDBcKX@nathan
Backpatch-through: 18 only
The revised logic in 001_ssltests.pl would fail if openssl
doesn't work or if Perl is a 32-bit build, because it had
already overwritten $serialno with something inappropriate
to use in the eventual match. We could go back to the
previous code layout, but it seems best to introduce a
separate variable for the output of openssl.
Per failure on buildfarm member mamba, which has a 32-bit Perl.
Tom Lane [Sun, 19 Oct 2025 18:36:58 +0000 (14:36 -0400)]
Don't rely on zlib's gzgetc() macro.
It emerges that zlib's configuration logic is not robust enough
to guarantee that the macro will have the same ideas about struct
field layout as the library itself does, leading to corruption of
zlib's state struct followed by unintelligible failure messages.
This hazard has existed for a long time, but we'd not noticed
for several reasons:
(1) We only use gzgetc() when trying to read a manually-compressed
TOC file within a directory-format dump, which is a rarely-used
scenario that we weren't even testing before 20ec99589.
(2) No corruption actually occurs unless sizeof(long) is different
from sizeof(off_t) and the platform is big-endian.
(3) Some platforms have already fixed the configuration instability,
at least sufficiently for their environments.
Despite (3), it seems foolish to assume that the problem isn't
going to be present in some environments for a long time to come.
Hence, avoid relying on this macro. We can just #undef it and
fall back on the underlying function of the same name.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2122679.1760846783@sss.pgh.pa.us
Backpatch-through: 13
Álvaro Herrera [Sat, 18 Oct 2025 16:18:19 +0000 (18:18 +0200)]
Fix determination of not-null constraint "locality" for inherited columns
It is possible to have a non-inherited not-null constraint on an
inherited column, but we were failing to preserve such constraints
during pg_upgrade where the source is 17 or older, because of a bug in
the pg_dump query for it. Oversight in commit 14e87ffa5c54. Fix that
query. In passing, touch-up a bogus nearby comment introduced by the
same commit.
In version 17, make the regression tests leave a table in this
situation, so that this scenario is tested in the cross-version upgrade
tests of 18 and up.
Álvaro Herrera [Sat, 18 Oct 2025 15:50:10 +0000 (17:50 +0200)]
Fix pg_dump sorting of foreign key constraints
Apparently, commit 04bc2c42f765 failed to notice that DO_FK_CONSTRAINT
objects require identical handling as DO_CONSTRAINT ones, which causes
some pg_upgrade tests in debug builds to fail spuriously. Add that.
David Rowley [Sat, 18 Oct 2025 03:07:41 +0000 (16:07 +1300)]
Fix reset of incorrect hash iterator in GROUPING SETS queries
This fixes an unlikely issue when fetching GROUPING SET results from
their internally stored hash tables. It was possible in rare cases that
the hash iterator would be set up incorrectly which could result in a
crash.
This was introduced in 4d143509c, so backpatch to v18.
Many thanks to Yuri Zamyatin for reporting and helping to debug this
issue.
Bug: #19078 Reported-by: Yuri Zamyatin <yuri@yrz.am>
Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/19078-dfd62f840a2c0766@postgresql.org
Backpatch-through: 18
Tomas Vondra [Fri, 17 Oct 2025 19:44:42 +0000 (21:44 +0200)]
Fix hashjoin memory balancing logic
Commit a1b4f289beec improved the hashjoin sizing to also consider the
memory used by BufFiles for batches. The code however had multiple
issues, making it ineffective or not working as expected in some cases.
* The amount of memory needed by buffers was calculated using uint32,
so it would overflow for nbatch >= 262144. If this happened the loop
would exit prematurely and the memory usage would not be reduced.
The nbatch overflow is fixed by reworking the condition to not use a
multiplication at all, so there's no risk of overflow. An explicit
cast was added to a similar calculation in ExecHashIncreaseBatchSize.
* The loop adjusting the nbatch value used hash_table_bytes to calculate
the old/new size, but then updated only space_allowed. The consequence
is the total memory usage was not reduced, but all the memory saved by
reducing the number of batches was used for the internal hash table.
This was fixed by using only space_allowed. This is also more correct,
because hash_table_bytes does not account for skew buckets.
* The code was also doubling multiple parameters (e.g. the number of
buckets for hash table), but was missing overflow protections.
The loop now checks for overflow, and terminates if needed. It'd be
possible to cap the value and continue the loop, but it's not worth
the complexity. And the overflow implies the in-memory hash table is
already very large anyway.
While at it, rework the comment explaining how the memory balancing
works, to make it more concise and easier to understand.
The initial nbatch overflow issue was reported by Vaibhav Jain. The
other issues were noticed by me and Melanie Plageman. Fix by me, with a
lot of review and feedback by Melanie.
Backpatch to 18, where the hashjoin memory balancing was introduced.
Nathan Bossart [Fri, 17 Oct 2025 16:36:50 +0000 (11:36 -0500)]
Fix privilege checks for pg_prewarm() on indexes.
pg_prewarm() currently checks for SELECT privileges on the target
relation. However, indexes do not have access rights of their own,
so a role may be denied permission to prewarm an index despite
having the SELECT privilege on its parent table. This commit fixes
this by locking the parent table before the index (to avoid
deadlocks) and checking for SELECT on the parent table. Note that
the code is largely borrowed from
amcheck_lock_relation_and_check().
An obvious downside of this change is the extra AccessShareLock on
the parent table during prewarming, but that isn't expected to
cause too much trouble in practice.
Author: Ayush Vatsa <ayushvatsa1810@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/CACX%2BKaMz2ZoOojh0nQ6QNBYx8Ak1Dkoko%3DD4FSb80BYW%2Bo8CHQ%40mail.gmail.com
Backpatch-through: 13
Avoid warnings in tests when openssl binary isn't available
The SSL tests for pg_stat_ssl tries to exactly match the serial
from the certificate by extracting it with the openssl binary.
If that fails due to the binary not being available, a fallback
match is used, but the attempt to execute a missing binary adds
a warning to the output which can confuse readers for a failure
in the test. Fix by only attempting if the openssl binary was
found by autoconf/meson.
Backpatch down to v16 where commit c8e4030d1bdd made the test
use the OPENSSL variable from autoconf/meson instead of a hard-
coded value.
Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Christoph Berg <myon@debian.org>
Discussion: https://postgr.es/m/aNPSp1-RIAs3skZm@msg.df7cb.de
Backpatch-through: 16
Michael Paquier [Fri, 17 Oct 2025 04:06:09 +0000 (13:06 +0900)]
Fix matching check in recovery test 042_low_level_backup
042_low_level_backup compared the result of a query two times with a
comparison operator based on an integer, while the result should be
compared with a string.
The outcome of the tests is currently not impacted by this change.
However, it could be possible that the tests fail to detect future
issues if the query results become different, for some reason.
Michael Paquier [Fri, 17 Oct 2025 04:01:17 +0000 (13:01 +0900)]
pg_createsubscriber: Fix matching check in TAP test
040_pg_createsubscriber has been calling safe_psql(), that returns the
result of a SQL query, with ok() without checking the result generated
(in this case 't', for a number of publications).
The outcome of the tests is currently not impacted by this change.
However, it could be possible that the test fails to detect future
issues if the query results become different.
The test is rewritten so as the number of publications is checked. This
is not the fix suggested originally by the author, but this is more
reliable in the long run.
Álvaro Herrera [Thu, 16 Oct 2025 18:21:05 +0000 (20:21 +0200)]
Fix update-po for the PGXS case
The original formulation failed to take into account the fact that for
the PGXS case, the source dir is not $(top_srcdir), so it ended up not
doing anything. Handle it explicitly.
Amit Langote [Thu, 16 Oct 2025 05:01:50 +0000 (14:01 +0900)]
Fix EPQ crash from missing partition directory in EState
EvalPlanQualStart() failed to propagate es_partition_directory into
the child EState used for EPQ rechecks. When execution time partition
pruning ran during the EPQ scan, executor code dereferenced a NULL
partition directory and crashed.
Previously, propagating es_partition_directory into the EPQ EState was
unnecessary because CreatePartitionPruneState(), which sets it on
demand, also initialized the exec-pruning context. After commit d47cbf474, CreatePartitionPruneState() now initializes only the init-
time pruning context, leaving exec-pruning context initialization to
ExecInitNode(). Since EvalPlanQualStart() runs only ExecInitNode() and
not CreatePartitionPruneState(), it can encounter a NULL
es_partition_directory. Other executor fields initialized during
CreatePartitionPruneState() are already copied into the child EState
thanks to commit 8741e48e5d, but es_partition_directory was missed.
Fix by borrowing the parent estate's es_partition_directory in
EvalPlanQualStart(), and by clearing that field in EvalPlanQualEnd()
so the parent remains responsible for freeing the directory.
Add an isolation test permutation that triggers EPQ with execution-
time partition pruning, the case that reproduces this crash.
Bug: #19078 Reported-by: Yuri Zamyatin <yuri@yrz.am> Diagnosed-by: David Rowley <dgrowleyml@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com> Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/19078-dfd62f840a2c0766@postgresql.org
Backpatch-through: 18
Nathan Bossart [Wed, 15 Oct 2025 17:47:33 +0000 (12:47 -0500)]
Fix lookups in pg_{clear,restore}_{attribute,relation}_stats().
Presently, these functions look up the relation's OID, lock it, and
then check privileges. Not only does this approach provide no
guarantee that the locked relation matches the arguments of the
lookup, but it also allows users to briefly lock relations for
which they do not have privileges, which might enable
denial-of-service attacks. This commit adjusts these functions to
use RangeVarGetRelidExtended(), which is purpose-built to avoid
both of these issues. The new RangeVarGetRelidCallback function is
somewhat complicated because it must handle both tables and
indexes, and for indexes, we must check privileges on the parent
table and lock it first. Also, it needs to handle a couple of
extremely unlikely race conditions involving concurrent OID reuse.
A downside of this change is that the coding doesn't allow for
locking indexes in AccessShare mode anymore; everything is locked
in ShareUpdateExclusive mode. Per discussion, the original choice
of lock levels was intended for a now defunct implementation that
used in-place updates, so we believe this change is okay.
Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/Z8zwVmGzXyDdkAXj%40nathan
Backpatch-through: 18
Etsuro Fujita [Wed, 15 Oct 2025 08:15:01 +0000 (17:15 +0900)]
Fix EvalPlanQual handling of foreign/custom joins in ExecScanFetch.
If inside an EPQ recheck, ExecScanFetch would run the recheck method
function for foreign/custom joins even if they aren't descendant nodes
in the EPQ recheck plan tree, which is problematic at least in the
foreign-join case, because such a foreign join isn't guaranteed to have
an alternative local-join plan required for running the recheck method
function; in the postgres_fdw case this could lead to a segmentation
fault or an assert failure in an assert-enabled build when running the
recheck method function.
Even if inside an EPQ recheck, any scan nodes that aren't descendant
ones in the EPQ recheck plan tree should be normally processed by using
the access method function; fix by modifying ExecScanFetch so that if
inside an EPQ recheck, it runs the recheck method function for
foreign/custom joins that are descendant nodes in the EPQ recheck plan
tree as before and runs the access method function for foreign/custom
joins that aren't.
This fix also adds to postgres_fdw an isolation test for an EPQ recheck
that caused issues stated above.
Michael Paquier [Mon, 13 Oct 2025 23:31:24 +0000 (08:31 +0900)]
Fix version number calculation for data folder flush in pg_combinebackup
The version number calculated by read_pg_version_file() is multiplied
once by 10000, to be able to do comparisons based on PG_VERSION_NUM or
equivalents with a minor version included. However, the version number
given sync_pgdata() was multiplied by 10000 a second time, leading to an
overestimated number.
This issue was harmless (still incorrect) as pg_combinebackup does not
support versions of Postgres older than v10, and sync_pgdata() only
includes a version check due to the rename of pg_xlog/ to pg_wal/. This
folder rename happened in the development cycle of v10. This would
become a problem if in the future sync_pgdata() is changed to have more
version-specific checks.
Oversight in dc212340058b, so backpatch down to v17.
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aOil5d0y87ZM_wsZ@paquier.xyz
Backpatch-through: 17
Tom Lane [Mon, 13 Oct 2025 21:56:45 +0000 (17:56 -0400)]
Fix incorrect message-printing in win32security.c.
log_error() would probably fail completely if used, and would
certainly print garbage for anything that needed to be interpolated
into the message, because it was failing to use the correct printing
subroutine for a va_list argument.
This bug likely went undetected because the error cases this code
is used for are rarely exercised - they only occur when Windows
security API calls fail catastrophically (out of memory, security
subsystem corruption, etc).
The FRONTEND variant can be fixed just by calling vfprintf()
instead of fprintf(). However, there was no va_list variant
of write_stderr(), so create one by refactoring that function.
Following the usual naming convention for such things, call
it vwrite_stderr().
Author: Bryan Green <dbryan.green@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=TzaM+tWPnw@mail.gmail.com
Backpatch-through: 13
David Rowley [Mon, 13 Oct 2025 20:25:34 +0000 (09:25 +1300)]
Doc: clarify n_distinct_inherited setting
There was some confusion around how to adjust the n_distinct estimates
for partitioned tables. Here we try and clarify that
n_distinct_inherited needs to be adjusted rather than n_distinct.
Also fix some slightly misleading text which was talking about table
size rather than table rows, fix a grammatical error, and adjust some
text which indicated that ANALYZE was performing calculations based on
the n_distinct settings. Really it's the query planner that does this
and ANALYZE only stores the overridden n_distinct estimate value in
pg_statistic.
Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Backpatch-through: 13
Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-ZmVw@mail.gmail.com
Tom Lane [Mon, 13 Oct 2025 16:44:20 +0000 (12:44 -0400)]
Fix issue with reading zero bytes in Gzip_read.
pg_dump expects a read request of zero bytes to be a no-op; see for
example ReadStr(). Gzip_read got this wrong and falsely supposed
that the resulting gzret == 0 indicated an error. We could complicate
that error-checking logic some more, but it seems best to just fall
out immediately when passed size == 0.
This bug breaks the nominally-supported case of manually gzip'ing
the toc.dat file within a directory-style dump, so back-patch to v16
where this code came in. (Prior branches already have a short-circuit
for size == 0 before their only gzread call.)
Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us
Backpatch-through: 16
Magnus Hagander [Mon, 13 Oct 2025 13:31:25 +0000 (15:31 +0200)]
docs: Fix protocol version 3.2 message format of CancelRequest
Since protocol version 3.2 the CancelRequest does not have a fixed size
length anymore. The protocol docs still listed the length field to be a
constant number though. This fixes that.
Peter Geoghegan [Sun, 12 Oct 2025 18:04:06 +0000 (14:04 -0400)]
Remove unused nbtree array advancement variable.
Remove a variable that is no longer in use following commit 9a2e2a28.
It's not immediately clear why there were no compiler warnings about
this oversight.
Author: Peter Geoghegan <pg@bowt.ie>
Backpatch-through: 18
Tom Lane [Sat, 11 Oct 2025 20:33:55 +0000 (16:33 -0400)]
Restore test coverage of LZ4Stream_gets().
In commit a45c78e32 I removed the only regression test case that
reaches this function, because it turns out that we only use it
if reading an LZ4-compressed blobs.toc file in a directory dump,
and that is a state that has to be created manually. That seems
like a bad thing to not test, not so much for LZ4Stream_gets()
itself as because it means the squirrely eol_flag logic in
LZ4Stream_read_internal() is not tested.
The reason for the change was that I thought the lz4 program did not
have any way to perform compression without explicit specification
of the output file name. However, it turns out that the syntax
synopsis in its man page is a lie, and if you read enough of the
man page you find out that with "-m" it will do what's needful.
So restore the manual compression step in that test case.
Noted while testing some proposed changes in pg_dump's compression
logic.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us
Backpatch-through: 17
Álvaro Herrera [Sat, 11 Oct 2025 18:30:12 +0000 (20:30 +0200)]
Stop creating constraints during DETACH CONCURRENTLY
Commit 71f4c8c6f74b (which implemented DETACH CONCURRENTLY) added code
to create a separate table constraint when a table is detached
concurrently, identical to the partition constraint, on the theory that
such a constraint was needed in case the optimizer had constructed any
query plans that depended on the constraint being there. However, that
theory was apparently bogus because any such plans would be invalidated.
For hash partitioning, those constraints are problematic, because their
expressions reference the OID of the parent partitioned table, to which
the detached table is no longer related; this causes all sorts of
problems (such as inability of restoring a pg_dump of that table, and
the table no longer working properly if the partitioned table is later
dropped).
We'd like to get rid of all those constraints. In fact, for branch
master, do that -- no longer create any substitute constraints.
However, out of fear that some users might somehow depend on these
constraints for other partitioning strategies, for stable branches
(back to 14, which added DETACH CONCURRENTLY), only do it for hash
partitioning.
(If you repeatedly DETACH CONCURRENTLY and then ATTACH a partition, then
with this constraint addition you don't need to scan the table in the
ATTACH step, which presumably is good. But if users really valued this
feature, they would have requested that it worked for non-concurrent
DETACH also.)
Author: Haiyang Li <mohen.lhy@alibaba-inc.com> Reported-by: Fei Changhong <feichanghong@qq.com> Reported-by: Haiyang Li <mohen.lhy@alibaba-inc.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/18371-7fef49f63de13f02@postgresql.org
Discussion: https://postgr.es/m/19070-781326347ade7c57@postgresql.org
Álvaro Herrera [Sat, 11 Oct 2025 14:39:22 +0000 (16:39 +0200)]
dbase_redo: Fix Valgrind-reported memory leak
Introduced by my (Álvaro's) commit 9e4f914b5eba, which was itself
backpatched to pg10, though only pg15 and up contain the problem
because of commit 9c08aea6a309.
This isn't a particularly significant leak, but given the fix is
trivial, we might as well backpatch to all branches where it applies, so
do that.
Author: Nathan Bossart <nathandbossart@gmail.com> Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/x4odfdlrwvsjawscnqsqjpofvauxslw7b4oyvxgt5owoyf4ysn@heafjusodrz7
Peter Geoghegan [Fri, 10 Oct 2025 18:52:23 +0000 (14:52 -0400)]
Remove overzealous _bt_killitems assertion.
An assertion in _bt_killitems expected the scan's currPos state to
contain a valid LSN, saved from when currPos's page was initially read.
The assertion failed to account for the fact that even logged relations
can have leaf pages with an invalid LSN when built with wal_level set to
"minimal". Remove the faulty assertion.
Oversight in commit e6eed40e (though note that the assertion was
backpatched to stable branches before 18 by commit 7c319f54).
Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Matthijs van der Vleuten <postgresql@zr40.nl>
Bug: #19082
Discussion: https://postgr.es/m/19082-628e62160dbbc1c1@postgresql.org
Backpatch-through: 13
Michael Paquier [Fri, 10 Oct 2025 00:24:48 +0000 (09:24 +0900)]
Remove state.tmp when failing to save a replication slot
An error happening while a slot data is saved on disk in
SaveSlotToPath() could cause a state.tmp file (temporary file holding
the slot state data, renamed to its permanent name at the end of the
function) to remain around after it has been created. This temporary
file is created with O_EXCL, meaning that if an existing state.tmp is
found, its creation would fail. This would prevent the slot data to be
saved, requiring a manual intervention to remove state.tmp before being
able to save again a slot. Possible scenarios where this temporary file
could remain on disk is for example a ENOSPC case (no disk space) while
writing, syncing or renaming it. The bug reports point to a write
failure as the principal cause of the problems.
Using O_TRUNC has been argued back in 2019 as a potential solution to
discard any temporary file that could exist. This solution was rejected
as O_EXCL can also act as a safety measure when saving the slot state,
crash recovery offering cleanup guarantees post-crash. This commit uses
the alternative approach that has been suggested by Andres Freund back
in 2019. When the temporary state file cannot be written, synced,
closed or renamed (note: not when created!), an unlink() is used to
remove the temporary state file while holding the in-progress I/O
LWLock, so as any follow-up attempts to save a slot's data would not
choke on an existing file that remained around because of a previous
failure.
This problem has been reported a few times across the years, going back
to 2019, but for some reason I have never come back to do something
about it and it has been forgotten. A recent report has reminded me
that this was still a problem.
Reported-by: Kevin K Biju <kevinkbiju@gmail.com> Reported-by: Sergei Kornilov <sk@zsrv.org> Reported-by: Grigory Smolkin <g.smolkin@postgrespro.ru>
Discussion: https://postgr.es/m/CAM45KeHa32soKL_G8Vk38CWvTBeOOXcsxAPAs7Jt7yPRf2mbVA@mail.gmail.com
Discussion: https://postgr.es/m/3559061693910326@qy4q4a6esb2lebnz.sas.yp-c.yandex.net
Discussion: https://postgr.es/m/08bbfab1-a61d-3750-fc18-4ab2c1aa7f09@postgrespro.ru
Backpatch-through: 13
Masahiko Sawada [Thu, 9 Oct 2025 17:59:29 +0000 (10:59 -0700)]
Fix access-to-already-freed-memory issue in pgoutput.
While pgoutput caches relation synchronization information in
RelationSyncCache that resides in CacheMemoryContext, each entry's
information (such as row filter expressions and column lists) is
stored in the entry's private memory context (entry_cxt in
RelationSyncEntry), which is a descendant memory context of the
decoding context. If a logical decoding invoked via SQL functions like
pg_logical_slot_get_binary_changes fails with an error, subsequent
logical decoding executions could access already-freed memory of the
entry's cache, resulting in a crash.
With this change, it's ensured that RelationSyncCache is cleaned up
even in error cases by using a memory context reset callback function.
Backpatch to 15, where entry_cxt was introduced for column filtering
and row filtering.
While the backbranches v13 and v14 have a similar issue where
RelationSyncCache persists even after an error when pgoutput is used
via SQL API, we decided not to backport this fix. This decision was
made because v13 is approaching its final minor release, and we won't
have an chance to fix any new issues that might arise. Additionally,
since using pgoutput via SQL API is not a common use case, the risk
outwights the benefit. If we receive bug reports, we can consider
backporting the fixes then.
Amit Langote [Thu, 9 Oct 2025 05:07:52 +0000 (01:07 -0400)]
Fix internal error from CollateExpr in SQL/JSON DEFAULT expressions
SQL/JSON functions such as JSON_VALUE could fail with "unrecognized
node type" errors when a DEFAULT clause contained an explicit COLLATE
expression. That happened because assign_collations_walker() could
invoke exprSetCollation() on a JsonBehavior expression whose DEFAULT
still contained a CollateExpr, which exprSetCollation() does not
handle.
For example:
SELECT JSON_VALUE('{"a":1}', '$.c' RETURNING text
DEFAULT 'A' COLLATE "C" ON EMPTY);
Fix by validating in transformJsonBehavior() that the DEFAULT
expression's collation matches the enclosing JSON expression’s
collation. In exprSetCollation(), replace the recursive call on the
JsonBehavior expression with an assertion that its collation already
matches the target, since the parser now enforces that condition.
Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxHVwYYSyiVQ6o+PsRX6zQ7rAFinh_fv1kCfTsT1xG4Zeg@mail.gmail.com
Backpatch-through: 17
Tom Lane [Sun, 5 Oct 2025 20:27:47 +0000 (16:27 -0400)]
Use SOCK_ERRNO[_SET] in fe-secure-gssapi.c.
On Windows, this code did not handle error conditions correctly at
all, since it looked at "errno" which is not used for socket-related
errors on that platform. This resulted, for example, in failure
to connect to a PostgreSQL server with GSSAPI enabled.
We have a convention for dealing with this within libpq, which is to
use SOCK_ERRNO and SOCK_ERRNO_SET rather than touching errno directly;
but the GSSAPI code is a relative latecomer and did not get that memo.
(The equivalent backend code continues to use errno, because the
backend does this differently. Maybe libpq's approach should be
rethought someday.)
Apparently nobody tries to build libpq with GSSAPI support on Windows,
or we'd have heard about this before, because it's been broken all
along. Back-patch to all supported branches.
Author: Ning Wu <ning94803@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFGqpvg-pRw=cdsUpKYfwY6D3d-m9tw8WMcAEE7HHWfm-oYWvw@mail.gmail.com
Backpatch-through: 13
John Naylor [Fri, 3 Oct 2025 09:05:37 +0000 (16:05 +0700)]
Fix reuse-after-free hazard in dead_items_reset
In similar vein to commit ccc8194e427, a reset instance of a shared
memory TID store happened to occupy the same private memory as the old
one for the entry point, since the chunk freed after the last round
of index vacuuming was put on the context's freelist. The failure
to update the vacrel->dead_items pointer was evident by nudging the
system to allocate memory in a different area. This was not discovered
at the time of the earlier commit since our regression tests didn't
cover multiple index passes with parallel vacuum.
Backpatch to v17, when TidStore came in.
Author: Kevin Oommen Anish <kevin.o@zohocorp.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Tested-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/199a07cbdfc.7a1c4aac25838.1675074408277594551%40zohocorp.com
Backpatch-through: 17
Michael Paquier [Fri, 3 Oct 2025 05:04:00 +0000 (14:04 +0900)]
pgbench: Fail cleanly when finding a COPY result state
Currently, pgbench aborts when a COPY response is received in
readCommandResponse(). However, as PQgetResult() returns an empty
result when there is no asynchronous result, through getCopyResult(),
the logic done at the end of readCommandResponse() for the error path
leads to an infinite loop.
This commit forcefully exits the COPY state with PQendcopy() before
moving to the error handler when fiding a COPY state, avoiding the
infinite loop. The COPY protocol is not supported by pgbench anyway, as
an error is assumed in this case, so giving up is better than having the
tool be stuck forever. pgbench was interruptible in this state.
A TAP test is added to check that an error happens if trying to use
COPY.
Michael Paquier [Thu, 2 Oct 2025 02:09:10 +0000 (11:09 +0900)]
pgstattuple: Improve reports generated for indexes (hash, gist, btree)
pgstattuple checks the state of the pages retrieved for gist and hash
using some check functions from each index AM, respectively
gistcheckpage() and _hash_checkpage(). When these are called, they
would fail when bumping on data that is found as incorrect (like opaque
area size not matching, or empty pages), contrary to btree that simply
discards these cases and continues to aggregate data.
Zero pages can happen after a crash, with these AMs being able to do an
internal cleanup when these are seen. Also, sporadic failures are
annoying when doing for example a large-scale diagnostic query based on
pgstattuple with a join of pg_class, as it forces one to use tricks like
quals to discard hash or gist indexes, or use a PL wrapper able to catch
errors.
This commit changes the reports generated for btree, gist and hash to
be more user-friendly;
- When seeing an empty page, report it as free space. This new rule
applies to gist and hash, and already applied to btree.
- For btree, a check based on the size of BTPageOpaqueData is added.
- For gist indexes, gistcheckpage() is not called anymore, replaced by a
check based on the size of GISTPageOpaqueData.
- For hash indexes, instead of _hash_getbuf_with_strategy(), use a
direct call to ReadBufferExtended(), coupled with a check based on
HashPageOpaqueData. The opaque area size check was already used.
- Pages that do not match these criterias are discarded from the stats
reports generated.
There have been a couple of bug reports over the years that complained
about the current behavior for hash and gist, as being not that useful,
with nothing being done about it. Hence this change is backpatched down
to v13.
Jacob Champion [Wed, 1 Oct 2025 15:14:23 +0000 (08:14 -0700)]
test_json_parser: Speed up 002_inline.pl
Some macOS machines are having trouble with 002_inline, which executes
the JSON parser test executables hundreds of times in a nested loop.
Both developer machines and buildfarm critters have shown excessive test
durations, upwards of 20 seconds.
Push the innermost loop of 002_inline, which iterates through differing
chunk sizes, down into the test executable. (I'd eventually like to push
all of the JSON unit tests down into C, but this is an easy win in the
short term.) Testers have reported a speedup between 4-9x.
Reported-by: Robert Haas <robertmhaas@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Tested-by: Andrew Dunstan <andrew@dunslane.net> Tested-by: Tom Lane <tgl@sss.pgh.pa.us> Tested-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BTgmobKoG%2BgKzH9qB7uE4MFo-z1hn7UngqAe9b0UqNbn3_XGQ%40mail.gmail.com
Backpatch-through: 17
pgbench: Fix error reporting in readCommandResponse().
pgbench uses readCommandResponse() to process server responses.
When readCommandResponse() encounters an error during a call to
PQgetResult() to fetch the current result, it attempts to report it
with an additional error message from PQerrorMessage(). However,
previously, this extra error message could be lost or become incorrect.
The cause was that after fetching the current result (and detecting
an error), readCommandResponse() called PQgetResult() again to
peek at the next result. This second call could overwrite the libpq
connection's error message before the original error was reported,
causing the error message retrieved from PQerrorMessage() to be
lost or overwritten.
This commit fixes the issue by updating readCommandResponse()
to use PQresultErrorMessage() instead of PQerrorMessage()
to retrieve the error message generated when the PQgetResult()
for the current result causes an error, ensuring the correct message
is reported.
Michael Paquier [Tue, 30 Sep 2025 00:02:35 +0000 (09:02 +0900)]
injection_points: Add proper locking when reporting fixed-variable stats
Contrary to its siblings for the archiver, the bgwriter and the
checkpointer stats, pgstat_report_inj_fixed() can be called
concurrently. This was causing an assertion failure, while messing up
with the stats.
This code is aimed at being a template for extension developers, so it
is not a critical issue, but let's be correct. This module has also
been useful for some benchmarking, at least for me, and that was how I
have discovered this issue.
Neighbor get_statistics_object_oid() ignores objects in pg_temp, as has
been the standard for non-relation, non-type namespace searches since
CVE-2007-2138. Hence, most operations that name a statistics object
correctly decline to map an unqualified name to a statistics object in
pg_temp. StatisticsObjIsVisibleExt() did not. Consequently,
pg_statistics_obj_is_visible() wrongly returned true for such objects,
psql \dX wrongly listed them, and getObjectDescription()-based ereport()
and pg_describe_object() wrongly omitted namespace qualification. Any
malfunction beyond that would depend on how a human or application acts
on those wrong indications. Commit d99d58cdc8c0b5b50ee92995e8575c100b1a458a introduced this. Back-patch to
v13 (all supported versions).
Tom Lane [Sat, 27 Sep 2025 18:29:41 +0000 (14:29 -0400)]
Fix missed copying of groupDistinct in transformPLAssignStmt.
Because we failed to do this, DISTINCT in GROUP BY DISTINCT would be
ignored in PL/pgSQL assignment statements. It's not surprising that
no one noticed, since such statements will throw an error if the query
produces more than one row. That eliminates most scenarios where
advanced forms of GROUP BY could be useful, and indeed makes it hard
even to find a simple test case. Nonetheless it's wrong.
This is directly the fault of be45be9c3 which added the groupDistinct
field, but I think much of the blame has to fall on c9d529848, in
which I incautiously supposed that we'd manage to keep two copies of
a big chunk of parse-analysis logic in sync. As a follow-up, I plan
to refactor so that there's only one copy. But that seems useful
only in master, so let's use this one-line fix for the back branches.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/31027.1758919078@sss.pgh.pa.us
Backpatch-through: 14
pgbench: Fix assertion failure with retriable errors in pipeline mode.
When running pgbench with --verbose-errors option and a custom script that
triggered retriable errors (e.g., serialization errors) in pipeline mode,
an assertion failure could occur:
Assertion failed: (sql_script[st->use_file].commands[st->command]->type == 1), function commandError, file pgbench.c, line 3062.
The failure happened because pgbench assumed these errors would only occur
during SQL commands, but in pipeline mode they can also happen during
\endpipeline meta command.
This commit fixes the assertion failure by adjusting the assertion check to
allow such errors during either SQL commands or \endpipeline.
Backpatch to v15, where the assertion check was introduced.
Tom Lane [Thu, 25 Sep 2025 17:29:02 +0000 (13:29 -0400)]
Add minimal sleep to stats isolation test functions.
The functions test_stat_func() and test_stat_func2() had empty
function bodies, so that they took very little time to run. This made
it possible that on machines with relatively low timer resolution the
functions could return before the clock advanced, making the test fail
(as seen on buildfarm members fruitcrow and hamerkop).
To avoid that, pg_sleep for 10us during the functions. As far as we
can tell, all current hardware has clock resolution much less than
that. (The current implementation of pg_sleep will round it up to
1ms anyway, but someday that might get improved.)
Author: Michael Banck <mbanck@gmx.net> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/68d413a3.a70a0220.24c74c.8be9@mx.google.com
Backpatch-through: 15
Robert Haas [Thu, 25 Sep 2025 15:43:52 +0000 (11:43 -0400)]
Fix array allocation bugs in SetExplainExtensionState.
If we already have an extension_state array but see a new extension_id
much larger than the highest the extension_id we've previously seen,
the old code might have failed to expand the array to a large enough
size, leading to disaster. Also, if we don't have an extension array
at all and need to create one, we should make sure that it's big enough
that we don't have to resize it instantly.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/2949591.1758570711@sss.pgh.pa.us
Backpatch-through: 18
When defining an injection point there is no need to wrap the definition
with USE_INJECTION_POINT guards, the INJECTION_POINT macro is available
in all builds. Remove to make the code consistent.
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/OSCPR01MB14966C8015DEB05ABEF2CE077F51FA@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17
Don't include execnodes.h in replication/conflict.h
... which silently propagates a lot of headers into many places
via pgstat.h, as evidenced by the variety of headers that this patch
needs to add to seemingly random places. Add a minimum of typedefs to
conflict.h to be able to remove execnodes.h, and fix the fallout.
Amit Kapila [Wed, 24 Sep 2025 04:00:15 +0000 (04:00 +0000)]
Fix LOCK_TIMEOUT handling during parallel apply.
Previously, the parallel apply worker used SIGINT to receive a graceful
shutdown signal from the leader apply worker. However, SIGINT is also used
by the LOCK_TIMEOUT handler to trigger a query-cancel interrupt. This
overlap caused the parallel apply worker to miss LOCK_TIMEOUT signals,
leading to incorrect behavior during lock wait/contention.
This patch resolves the conflict by switching the graceful shutdown signal
from SIGINT to SIGUSR2.
Reported-by: Zane Duffield <duffieldzane@gmail.com> Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CACMiCkXyC4au74kvE2g6Y=mCEF8X6r-Ne_ty4r7qWkUjRE4+oQ@mail.gmail.com
Michael Paquier [Sun, 21 Sep 2025 23:03:25 +0000 (08:03 +0900)]
Fix meson build with -Duuid=ossp when using version older than 0.60
The package for the UUID library may be named "uuid" or "ossp-uuid", and
meson.build has been using a single call of dependency() with multiple
names, something only supported since meson 0.60.0.
The minimum version of meson supported by Postgres is 0.57.2 on HEAD,
since f039c2244110, and 0.54 on stable branches down to 16.
Author: Oreo Yang <oreo.yang@hotmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/OS3P301MB01656E6F91539770682B1E77E711A@OS3P301MB0165.JPNP301.PROD.OUTLOOK.COM
Backpatch-through: 16
Michael Paquier [Mon, 22 Sep 2025 00:04:20 +0000 (09:04 +0900)]
Revert "Fix meson build with -Duuid=ossp when using version older than 0.60"
This reverts commit 5f565b0aee90 temporarily on v18. This branch is in
a release freeze state until tagged. Let's re-add this commit once the
release is out. The other branches are left untouched.
Michael Paquier [Sun, 21 Sep 2025 23:03:25 +0000 (08:03 +0900)]
Fix meson build with -Duuid=ossp when using version older than 0.60
The package for the UUID library may be named "uuid" or "ossp-uuid", and
meson.build has been using a single call of dependency() with multiple
names, something only supported since meson 0.60.0.
The minimum version of meson supported by Postgres is 0.57.2 on HEAD,
since f039c2244110, and 0.54 on stable branches down to 16.
Author: Oreo Yang <oreo.yang@hotmail.com> Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/OS3P301MB01656E6F91539770682B1E77E711A@OS3P301MB0165.JPNP301.PROD.OUTLOOK.COM
Backpatch-through: 16
David Rowley [Fri, 19 Sep 2025 11:35:53 +0000 (23:35 +1200)]
Improve wording in a few comments
Initially this was to fix the "catched" typo, but I (David) wasn't quite
clear on what the previous comment meant about being "effective". I
expect this means efficiency, so I've reworded the comment to indicate
that.
While this is only a comment fixup, for the sake of possibly minimizing
possible future backpatching pain, I've opted to backpatch to 18 since
this code is new to that version and the release isn't out the door yet.
Author: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAHewXNmSYWPud1sfBvpKbCJeRkWeZYuqatxtV9U9LvAFXBEiBw@mail.gmail.com
Backpatch-through: 18
Amit Langote [Fri, 19 Sep 2025 02:39:06 +0000 (11:39 +0900)]
Fix EPQ crash from missing partition pruning state in EState
Commit bb3ec16e14 moved partition pruning metadata into PlannedStmt.
At executor startup this metadata is used to initialize the EState
fields es_part_prune_infos, es_part_prune_states, and
es_part_prune_results. EvalPlanQualStart() failed to copy those
fields into the child EState, causing NULL dereference when Append
ran partition pruning during a recheck. This can occur with DELETE
or UPDATE on partitioned tables that use runtime pruning, e.g. with
generic plans.
Fix by copying all partition pruning state into the EPQ estate.
Add an isolation test that reproduces the crash with concurrent
UPDATE and DELETE on a partitioned table, where the DELETE session
hits the crash during its EPQ recheck after the UPDATE commits.
Bug: #19056 Reported-by: Fei Changhong <feichanghong@qq.com> Diagnozed-by: Fei Changhong <feichanghong@qq.com>
Author: David Rowley <dgrowleyml@gmail.com> Co-authored-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/19056-a677cef9b54d76a0%40postgresql.org
pg_restore: Fix security label handling with --no-publications/subscriptions.
Previously, pg_restore did not skip security labels on publications or
subscriptions even when --no-publications or --no-subscriptions was specified.
As a result, it could issue SECURITY LABEL commands for objects that were
never created, causing those commands to fail.
This commit fixes the issue by ensuring that security labels on publications
and subscriptions are also skipped when the corresponding options are used.
Tom Lane [Wed, 17 Sep 2025 20:32:57 +0000 (16:32 -0400)]
Calculate agglevelsup correctly when Aggref contains a CTE.
If an aggregate function call contains a sub-select that has
an RTE referencing a CTE outside the aggregate, we must treat
that reference like a Var referencing the CTE's query level
for purposes of determining the aggregate's level. Otherwise
we might reach the nonsensical conclusion that the aggregate
should be evaluated at some query level higher than the CTE,
ending in a planner error or a broken plan tree that causes
executor failures.
Bug: #19055 Reported-by: BugForge <dllggyx@outlook.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19055-6970cfa8556a394d@postgresql.org
Backpatch-through: 13
Michael Paquier [Wed, 17 Sep 2025 01:16:10 +0000 (10:16 +0900)]
injection_points: Fix incrementation of variable-numbered stats
The pending entry was not used when incrementing its data, directly
manipulating the shared memory pointer, without even locking it. This
could mean losing statistics under concurrent activity. The flush
callback was a no-op.
This code serves as a base template for extensions for the custom
cumulative statistics, so let's be clean and use a pending entry for the
incrementations, whose data is then flushed to the corresponding entry
in the shared hashtable when all the stats are reported, in its own
flush callback.
Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0v0U0yhPbY+bqChomkPbyUrRQ3rQXnZf_SB-svDiQOpgQ@mail.gmail.com
Backpatch-through: 18
Michael Paquier [Wed, 17 Sep 2025 00:33:35 +0000 (09:33 +0900)]
Fix shared memory calculation size of PgAioCtl
The shared memory size was calculated based on an offset of io_handles,
which is itself a pointer included in the structure. We tend to
overestimate the shared memory size overall, so this was unlikely an
issue in practice, but let's be correct and use the full size of the
structure in the calculation, so as the pointer for io_handles is
included.
David Rowley [Wed, 17 Sep 2025 00:19:51 +0000 (12:19 +1200)]
Add missing EPQ recheck for TID Range Scan
The EvalPlanQual recheck for TID Range Scan wasn't rechecking the TID qual
still passed after following update chains. This could result in tuples
being updated or deleted by plans using TID Range Scans where the ctid of
the new (updated) tuple no longer matches the clause of the scan. This
isn't desired behavior, and isn't consistent with what would happen if the
chosen plan had used an Index or Seq Scan, and that could lead to hard to
predict behavior for scans that contain TID quals and other quals as the
planner has freedom to choose TID Range or some other non-TID scan method
for such queries, and the chosen plan could change at any moment.
Here we fix this by properly implementing the recheck function for TID
Range Scans.
Backpatch to 14, where TID Range Scans were added
Reported-by: Sophie Alpert <pg@sophiebits.com>
Author: Sophie Alpert <pg@sophiebits.com>
Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/4a6268ff-3340-453a-9bf5-c98d51a6f729@app.fastmail.com
Backpatch-through: 14
David Rowley [Tue, 16 Sep 2025 23:49:26 +0000 (11:49 +1200)]
Add missing EPQ recheck for TID Scan
The EvalPlanQual recheck for TID Scan wasn't rechecking the TID qual
still passed after following update chains. This could result in tuples
being updated or deleted by plans using TID Scans where the ctid of the
new (updated) tuple no longer matches the clause of the scan. This isn't
desired behavior, and isn't consistent with what would happen if the
chosen plan had used an Index or Seq Scan, and that could lead to hard to
predict behavior for scans that contain TID quals and other quals as the
planner has freedom to choose TID or some other scan method for such
queries, and the chosen plan could change at any moment.
Here we fix this by properly implementing the recheck function for TID
Scans.
Backpatch to 13, oldest supported version
Reported-by: Sophie Alpert <pg@sophiebits.com>
Author: Sophie Alpert <pg@sophiebits.com>
Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/4a6268ff-3340-453a-9bf5-c98d51a6f729@app.fastmail.com
Backpatch-through: 13
Tom Lane [Tue, 16 Sep 2025 17:05:53 +0000 (13:05 -0400)]
Revert "Avoid race condition between "GRANT role" and "DROP ROLE"".
This reverts commit 98fc31d6499163a0a781aa6f13582a07f09cd7c6.
That change allowed DROP OWNED BY to drop grants of the target
role to other roles, arguing that nobody would need those
privileges anymore. But that's not so: if you're not superuser,
you still need admin privilege on the target role so you can
drop it.
It's not clear whether or how the dependency-based approach
to solving the original problem can be adapted to keep these
grants. Since v18 release is fast approaching, the sanest
thing to do seems to be to revert this patch for now. The
race-condition problem is low severity and not worth taking
risks for.
I didn't force a catversion bump in 98fc31d64, so I won't do
so here either.
Reported-by: Dipesh Dhameliya <dipeshdhameliya125@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CABgZEgczOFicCJoqtrH9gbYMe_BV3Hq8zzCBRcMgmU6LRsihUA@mail.gmail.com
Backpatch-through: 18
Fix pg_dump COMMENT dependency for separate domain constraints.
The COMMENT should depend on the separately-dumped constraint, not the
domain. Sufficient restore parallelism might fail the COMMENT command
by issuing it before the constraint exists. Back-patch to v13, like
commit 0858f0f96ebb891c8960994f023ed5a17b758a38.
Tom Lane [Tue, 16 Sep 2025 14:09:49 +0000 (10:09 -0400)]
Add regression expected-files for older OpenSSL in FIPS mode.
In our previous discussions around making our regression tests
pass in FIPS mode, we concluded that we didn't need to support
the different error message wording observed with pre-3.0 OpenSSL.
However there are still a few LTS distributions soldiering along
with such versions, and now we have some in the buildfarm.
So let's add the variant expected-files needed to make them happy.
This commit only covers the core regression tests. Previous
discussion suggested that we might need some adjustments in
contrib as well, but it's not totally clear to me what those
would be. Rather than work it out from first principles,
I'll wait to see what the buildfarm shows.
Back-patch to v17 which is the oldest branch that claims
to support this case.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/443709.1757876535@sss.pgh.pa.us
Backpatch-through: 17
Richard Guo [Tue, 16 Sep 2025 09:42:20 +0000 (18:42 +0900)]
Treat JsonConstructorExpr as non-strict
JsonConstructorExpr can produce non-NULL output with a NULL input, so
it should be treated as a non-strict construct. Failing to do so can
lead to incorrect query behavior.
For example, in the reported case, when pulling up a subquery that is
under an outer join, if the subquery's target list contains a
JsonConstructorExpr that uses subquery variables and it is mistakenly
treated as strict, it will be pulled up without being wrapped in a
PlaceHolderVar. As a result, the expression will be evaluated at the
wrong place and will not be forced to null when the outer join should
do so.
Back-patch to v16 where JsonConstructorExpr was introduced.
Bug: #19046 Reported-by: Runyuan He <runyuan@berkeley.edu>
Author: Tender Wang <tndrwang@gmail.com> Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/19046-765b6602b0a8cfdf@postgresql.org
Backpatch-through: 16
Peter Eisentraut [Tue, 16 Sep 2025 06:59:12 +0000 (08:59 +0200)]
Move pg_int64 back to postgres_ext.h
Fix for commit 3c86223c998. That commit moved the typedef of pg_int64
from postgres_ext.h to libpq-fe.h, because the only remaining place
where it might be used is libpq users, and since the type is obsolete,
the intent was to limit its scope.
The problem is that if someone builds an extension against an
older (pre-PG18) server version and a new (PG18) libpq, they might get
two typedefs, depending on include file order. This is not allowed
under C99, so they might get warnings or errors, depending on the
compiler and options. The underlying types might also be
different (e.g., long int vs. long long int), which would also lead to
errors. This scenario is plausible when using the standard Debian
packaging, which provides only the newest libpq but per-major-version
server packages.
The fix is to undo that part of commit 3c86223c998. That way, the
typedef is in the same header file across versions. At least, this is
the safest fix doable before PostgreSQL 18 releases.
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/25144219-5142-4589-89f8-4e76948b32db%40eisentraut.org
pg_dump: Fix dumping of security labels on subscriptions and event triggers.
Previously, pg_dump incorrectly queried pg_seclabel to retrieve security labels
for subscriptions, which are stored in pg_shseclabel as they are global objects.
This could result in security labels for subscriptions not being dumped.
This commit fixes the issue by updating pg_dump to query the pg_seclabels view,
which aggregates entries from both pg_seclabel and pg_shseclabel.
While querying pg_shseclabel directly for subscriptions was an alternative,
using pg_seclabels is simpler and sufficient.
In addition, pg_dump is updated to dump security labels on event triggers,
which were previously omitted.
Peter Eisentraut [Tue, 16 Sep 2025 05:23:50 +0000 (07:23 +0200)]
Fix incorrect const qualifier
Commit 7202d72787d added in passing some const qualifiers, but the one
on the postmaster_child_launch() startup_data argument was incorrect,
because the function itself modifies the pointed-to data. This is
hidden from the compiler because of casts. The qualifiers on the
functions called by postmaster_child_launch() are still correct.
pg_restore: Fix comment handling with --no-policies.
Previously, pg_restore did not skip comments on policies even when
--no-policies was specified. As a result, it could issue COMMENT commands
for policies that were never created, causing those commands to fail.
This commit fixes the issue by ensuring that comments on policies
are also skipped when --no-policies is used.
Backpatch to v18, where --no-policies was added in pg_restore.
pg_restore: Fix comment handling with --no-publications / --no-subscriptions.
Previously, pg_restore did not skip comments on publications or subscriptions
even when --no-publications or --no-subscriptions was specified. As a result,
it could issue COMMENT commands for objects that were never created,
causing those commands to fail.
This commit fixes the issue by ensuring that comments on publications and
subscriptions are also skipped when the corresponding options are used.
Peter Eisentraut [Mon, 15 Sep 2025 14:27:50 +0000 (16:27 +0200)]
Expand virtual generated columns in constraint expressions
Virtual generated columns in constraint expressions need to be
expanded because the optimizer matches these expressions to qual
clauses. Failing to do so can cause us to miss opportunities for
constraint exclusion.
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/204804c0-798f-4c72-bd1f-36116024fda3%40eisentraut.org
The previous change (commit f225473cbae) was still not on target,
because it talked about relation kinds, which are not what is being
checked here. Provide a more accurate message.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxEZ48toGH0Em_6vdsT57Y3L8pLF=DZCQ_gCii6=C3MeXw@mail.gmail.com
Peter Eisentraut [Mon, 15 Sep 2025 05:25:22 +0000 (07:25 +0200)]
Hide duplicate names from extension views
If extensions of equal names were installed in different directories
in the path, the views pg_available_extensions and
pg_available_extension_versions would show all of them, even though
only the first one was actually reachable by CREATE EXTENSION. To
fix, have those views skip extensions found later in the path if they
have names already found earlier.
Also add a bit of documentation that only the first extension in the
path can be used.
Peter Geoghegan [Sun, 14 Sep 2025 01:01:31 +0000 (21:01 -0400)]
nbtree: Always set skipScan flag on rescan.
The TimescaleDB extension expects to be able to change an nbtree scan's
keys across rescans. The issue arises in the extension's implementation
of loose index scan. This is arguably a misuse of the index AM API,
though apparently it worked until recently. It stopped working when the
skipScan flag was added to BTScanOpaqueData by commit 8a510275, though.
The flag wouldn't reliably track whether the scan (actually, the current
rescan) has any skip arrays, leading to confusion in _bt_set_startikey.
nbtree preprocessing will now defensively initialize the scan's skipScan
flag in all cases, including the case where _bt_preprocess_array_keys
returns early due to the (re)scan not using arrays. While nbtree isn't
obligated to support this use case (at least not according to my reading
of the index AM API), it still seems like a good idea to be consistent
here, on general robustness grounds.
Tom Lane [Sat, 13 Sep 2025 20:55:51 +0000 (16:55 -0400)]
Amend recent fix for SIMILAR TO regex conversion.
Commit e3ffc3e91 fixed the translation of character classes in
SIMILAR TO regular expressions. Unfortunately the fix broke a corner
case: if there is an escape character right after the opening bracket
(for example in "[\q]"), a closing bracket right after the escape
sequence would not be seen as closing the character class.
There were two more oversights: a backslash or a nested opening bracket
right at the beginning of a character class should remove the special
meaning from any following caret or closing bracket.
This bug suggests that this code needs to be more readable, so also
rename the variables "charclass_depth" and "charclass_start" to
something more meaningful, rewrite an "if" cascade to be more
consistent, and improve the commentary.
Reported-by: Dominique Devienne <ddevienne@gmail.com> Reported-by: Stephan Springl <springl-psql@bfw-online.de>
Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFCRh-8NwJd0jq6P=R3qhHyqU7hw0BTor3W0SvUcii24et+zAw@mail.gmail.com
Backpatch-through: 13
Tom Lane [Fri, 12 Sep 2025 21:43:15 +0000 (17:43 -0400)]
Fix oversights in pg_event_trigger_dropped_objects() fixes.
Commit a0b99fc12 caused pg_event_trigger_dropped_objects()
to not fill the object_name field for schemas, which it
should have; and caused it to fill the object_name field
for default values, which it should not have.
In addition, triggers and RLS policies really should behave
the same way as we're making column defaults do; that is,
they should have is_temporary = true if they belong to a
temporary table.
Fix those things, and upgrade event_trigger.sql's woefully
inadequate test coverage of these secondary output columns.
As before, back-patch only to v15.
Reported-by: Sergey Shinderuk <s.shinderuk@postgrespro.ru>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/bd7b4651-1c26-4d30-832b-f942fabcb145@postgrespro.ru
Backpatch-through: 15
Peter Geoghegan [Fri, 12 Sep 2025 17:22:58 +0000 (13:22 -0400)]
Always commute strategy when preprocessing DESC keys.
A recently added nbtree preprocessing step failed to account for the
fact that DESC columns already had their B-Tree strategy number commuted
at this point in preprocessing. As a result, preprocessing could output
a set of scan keys where one or more keys had the correct strategy
number, but used the wrong comparison routine.
To fix, make the faulty code path that looks up a more restrictive
replacement operator/comparison routine commute its requested inequality
strategy (while outputting the transformed strategy number as before).
This makes the final transformed scan key comport with the approach
preprocessing has always used to deal with DESC columns (which is
described by comments above _bt_fix_scankey_strategy).
Oversight in commit commit b3f1a13f, which made nbtree preprocessing
perform transformations on skip array inequalities that can reduce the
total number of index searches.
Andres Freund [Fri, 12 Sep 2025 14:18:31 +0000 (10:18 -0400)]
ci: openbsd: Increase RAM disk's size
Its size was ~3.8GB before, which sometimes was not enough. OpenBSD CI task
often were failing due to no space left on device. Increase the RAM disk size
to ~4.6 GB.
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/CAN55FZ2XVVPJRJmGB2DsL3gOrOinWh=HWvj6GO1cHzJ=6LwTag@mail.gmail.com
Backpatch-through: 18, where openbsd was added to CI