Thomas Munro [Sat, 9 Aug 2025 00:33:06 +0000 (12:33 +1200)]
Fix rare bug in read_stream.c's split IO handling.
The internal queue of buffers could become corrupted in a rare edge case
that failed to invalidate an entry, causing a stale buffer to be
"forwarded" to StartReadBuffers(). This is a simple fix for the
immediate problem.
A small API change might be able to remove this and related fragility
entirely, but that will have to wait a bit.
postgres_fdw and dblink should check if backend has MyProcPort
before checking ->has_scram_keys. MyProcPort is NULL in background
workers. So this could crash for example if a background worker
accessed a suitable configured foreign table.
Author: Alexander Pyhalov <a.pyhalov@postgrespro.ru> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/27b29a35-9b96-46a9-bc1a-914140869dac%40gmail.com
Jacob Champion [Fri, 8 Aug 2025 15:44:56 +0000 (08:44 -0700)]
oauth: Track total call count during a client flow
Tracking down the bugs that led to the addition of comb_multiplexer()
and drain_timer_events() was difficult, because an inefficient flow is
not visibly different from one that is working properly. To help
maintainers notice when something has gone wrong, track the number of
calls into the flow as part of debug mode, and print the total when the
flow finishes.
A new test makes sure the total count is less than 100. (We expect
something on the order of 10.) This isn't foolproof, but it is able to
catch several regressions in the logic of the prior two commits, and
future work to add TLS support to the oauth_validator test server should
strengthen it as well.
Jacob Champion [Fri, 8 Aug 2025 15:44:52 +0000 (08:44 -0700)]
oauth: Remove expired timers from the multiplexer
In a case similar to the previous commit, an expired timer can remain
permanently readable if Curl does not remove the timeout itself. Since
that removal isn't guaranteed to happen in real-world situations,
implement drain_timer_events() to reset the timer before calling into
drive_request().
Moving to drain_timer_events() happens to fix a logic bug in the
previous caller of timer_expired(), which treated an error condition as
if the timer were expired instead of bailing out.
The previous implementation of timer_expired() gave differing results
for epoll and kqueue if the timer was reset. (For epoll, a reset timer
was considered to be expired, and for kqueue it was not.) This didn't
previously cause problems, since timer_expired() was only called while
the timer was known to be set, but both implementations now use the
kqueue logic.
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com
Jacob Champion [Fri, 8 Aug 2025 15:44:46 +0000 (08:44 -0700)]
oauth: Ensure unused socket registrations are removed
If Curl needs to switch the direction of a socket's registration (e.g.
from CURL_POLL_IN to CURL_POLL_OUT), it expects the old registration to
be discarded. For epoll, this happened via EPOLL_CTL_MOD, but for
kqueue, the old registration would remain if it was not explicitly
removed by Curl.
Explicitly remove the opposite-direction event during registrations. (If
that event doesn't exist, we'll just get an ENOENT, which will be
ignored by the same code that handles CURL_POLL_REMOVE.) A few
assertions are also added to strengthen the relationship between the
number of events added, the number of events pulled off the queue, and
the lengths of the kevent arrays.
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com
Jacob Champion [Fri, 8 Aug 2025 15:44:37 +0000 (08:44 -0700)]
oauth: Remove stale events from the kqueue multiplexer
If a socket is added to the kqueue, becomes readable/writable, and
subsequently becomes non-readable/writable again, the kqueue itself will
remain readable until either the socket registration is removed, or the
stale event is cleared via a call to kevent().
In many simple cases, Curl itself will remove the socket registration
quickly, but in real-world usage, this is not guaranteed to happen. The
kqueue can then remain stuck in a permanently readable state until the
request ends, which results in pointless wakeups for the client and
wasted CPU time.
Implement comb_multiplexer() to call kevent() and unstick any stale
events that would cause unnecessary callbacks. This is called right
after drive_request(), before we return control to the client to wait.
Suggested-by: Thomas Munro <thomas.munro@gmail.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CAOYmi+nDZxJHaWj9_jRSyf8uMToCADAmOfJEggsKW-kY7aUwHA@mail.gmail.com
Fix incorrect lack of Datum conversion in _int_matchsel()
The code used
return (Selectivity) 0.0;
where
PG_RETURN_FLOAT8(0.0);
would be correct.
On 64-bit systems, these are pretty much equivalent, but on 32-bit
systems, PG_RETURN_FLOAT8() correctly produces a pointer, but the old
wrong code would return a null pointer, possibly leading to a crash
elsewhere.
We think this code is actually not reachable because bqarr_in won't
accept an empty query, and there is no other function that will
create query_int values. But better be safe and not let such
incorrect code lie around.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
Etsuro Fujita [Fri, 8 Aug 2025 08:35:00 +0000 (17:35 +0900)]
Fix oversight in FindTriggerIncompatibleWithInheritance.
This function is called from ATExecAttachPartition/ATExecAddInherit,
which prevent tables with row-level triggers with transition tables from
becoming partitions or inheritance children, to check if there is such a
trigger on the given table, but failed to check if a found trigger is
row-level, causing the caller functions to needlessly prevent a table
with only a statement-level trigger with transition tables from becoming
a partition or inheritance child. Repair.
Fujii Masao [Fri, 8 Aug 2025 05:36:39 +0000 (14:36 +0900)]
pg_dump: Fix incorrect parsing of object types in pg_dump --filter.
Previously, pg_dump --filter could misinterpret invalid object types
in the filter file as valid ones. For example, the invalid object type
"table-data" (likely a typo for the valid "table_data") could be
mistakenly recognized as "table", causing pg_dump to succeed
when it should have failed.
This happened because pg_dump identified keywords as sequences of
ASCII alphabetic characters, treating non-alphabetic characters
(like hyphens) as keyword boundaries. As a result, "table-data" was
parsed as "table".
To fix this, pg_dump --filter now treats keywords as strings of
non-whitespace characters, ensuring invalid types like "table-data"
are correctly rejected.
Back-patch to v17, where the --filter option was introduced.
Etsuro Fujita [Fri, 8 Aug 2025 01:50:01 +0000 (10:50 +0900)]
Disallow collecting transition tuples from child foreign tables.
Commit 9e6104c66 disallowed transition tables on foreign tables, but
failed to account for cases where a foreign table is a child table of a
partitioned/inherited table on which transition tables exist, leading to
incorrect transition tuples collected from such foreign tables for
queries on the parent table triggering transition capture. This
occurred not only for inherited UPDATE/DELETE but for partitioned INSERT
later supported by commit 3d956d956, which should have handled it at
least for the INSERT case, but didn't.
To fix, modify ExecAR*Triggers to throw an error if the given relation
is a foreign table requesting transition capture. Also, this commit
fixes make_modifytable so that in case of an inherited UPDATE/DELETE
triggering transition capture, FDWs choose normal operations to modify
child foreign tables, not DirectModify; which is needed because they
would otherwise skip the calls to ExecAR*Triggers at execution, causing
unexpected behavior.
Michael Paquier [Fri, 8 Aug 2025 00:07:49 +0000 (09:07 +0900)]
Add information about "generation" when dropping twice pgstats entry
Dropping twice a pgstats entry should not happen, and the error report
generated was missing the "generation" counter (tracking when an entry
is reused) that has been added in 818119afccd3.
Like d92573adcb02, backpatch down to v15 where this information is
useful to have, to gather more information from instances where the
problem shows up. A report has shown that this error path has been
reached on a standby based on 17.3, for a relation stats entry and an
OID close to wraparound.
Jacob Champion [Thu, 7 Aug 2025 22:31:36 +0000 (15:31 -0700)]
meson: Fix install-quiet after clean
libpq-oauth was missing from the installed_targets list, so
$ ninja clean && ninja install-quiet
failed with the error message
ERROR: File 'src/interfaces/libpq-oauth/libpq-oauth.a' could not be found
It seems a little odd to have to tell Meson what's missing, since it
clearly knows how to build that file during regular installation. But
the "quiet" variant we've created must use --no-rebuild, to avoid
spawning concurrent ninja processes that would step on each other.
Reported-by: Andres Freund <andres@anarazel.de>
Backpatch-through: 18
Discussion: https://postgr.es/m/hbpqdwxkfnqijaxzgdpvdtp57s7gwxa5d6sbxswovjrournlk6%404jnb2gzan4em
Tom Lane [Thu, 7 Aug 2025 22:04:45 +0000 (18:04 -0400)]
doc: add float as an alias for double precision.
Although the "Floating-Point Types" section says that "float" data
type is taken to mean "double precision", this information was not
reflected in the data type table that lists all data type aliases.
Reported-by: alexander.kjall@hafslund.no
Author: Euler Taveira <euler@eulerto.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/175456294638.800.12038559679827947313@wrigleys.postgresql.org
Backpatch-through: 13
Small touch-up on commits 25505082f0e and 50fd428b2b9. Fix the
formatting of the example messages in the documentation and adjust the
wording to match the code.
Use Min(NBuffers, MAX_CHECKPOINT_REQUESTS) instead of NBuffers in
CheckpointerShmemSize() to match the actual array size limit set in
CheckpointerShmemInit(). This prevents wasting shared memory when
NBuffers > MAX_CHECKPOINT_REQUESTS. Also, fix the comment.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1439188.1754506714%40sss.pgh.pa.us
Author: Xuneng Zhou <xunengzhou@gmail.com> Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Revert "Clarify documentation for the initcap function"
This reverts commit 1fe9e3822c4e574aa526b99af723e61e03f36d4f. That commit
was a documentation improvement, not a bug fix. We don't normally backpatch
such changes.
John Naylor [Thu, 7 Aug 2025 10:12:44 +0000 (17:12 +0700)]
Update ICU C++ API symbols
Recent ICU versions have added U_SHOW_CPLUSPLUS_HEADER_API, and we need
to set this to zero as well to hide the ICU C++ APIs from pg_locale.h
Per discussion, we want cpluspluscheck to work cleanly in backbranches,
so backpatch both this and its predecessor commit ed26c4e25a4 to all
supported versions.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1115793.1754414782%40sss.pgh.pa.us
Backpatch-through: 13
Michael Paquier [Thu, 7 Aug 2025 02:49:29 +0000 (11:49 +0900)]
Improve tests of date_trunc() with infinity and unsupported units
Commit d85ce012f99f has added some new error handling code to
date_trunc() of timestamp, timestamptz, and interval with infinite
values.
However, the new test cases added by that commit did not actually test
all of the new code, missing coverage for the following cases:
1) For timestamp without time zone:
1-1) infinite value with valid unit
1-2) infinite value with unsupported unit
1-3) finite value with unsupported unit, for a code path older than d85ce012f99f.
2) For timestamp with time zone, without a time zone specified for the
truncation:
2-1) infinite value with valid unit
2-2) infinite value with unsupported unit
2-3) finite value with unsupported unit, for a code path older than d85ce012f99f.
3) For timestamp with time zone, with a time zone specified for the
truncation:
3-1) infinite value with valid unit.
3-2) infinite value with unsupported unit.
This commit also provides coverage for the bug fixed in 2242b26ce472,
through cases 2-1) and 3-1), when using an infinite value with a valid
unit, with[out] the optional time zone parameter used for the
truncation.
Author: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/2d320b6f-b4af-4fbc-9eec-5d0fa15d187b@eisentraut.org
Discussion: https://postgr.es/m/4bf60a84-2862-4a53-acd5-8eddf134a60e@eisentraut.org
Backpatch-through: 18
Michael Paquier [Thu, 7 Aug 2025 02:02:09 +0000 (11:02 +0900)]
Fix incorrect Datum conversion in timestamptz_trunc_internal()
The code used a PG_RETURN_TIMESTAMPTZ() where the return type is
TimestampTz and not a Datum.
On 64-bit systems, there is no effect since this just ends up casting
64-bit integers back and forth. On 32-bit systems, timestamptz is
pass-by-reference. PG_RETURN_TIMESTAMPTZ() allocates new memory and
returns the address, meaning that the caller could interpret this as a
timestamp value.
The effect is using "date_trunc(..., 'infinity'::timestamptz) will
return random values (instead of the correct return value 'infinity').
Author: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2d320b6f-b4af-4fbc-9eec-5d0fa15d187b@eisentraut.org
Discussion: https://postgr.es/m/4bf60a84-2862-4a53-acd5-8eddf134a60e@eisentraut.org
Backpatch-through: 18
These were introduced (commit efdc7d74753) at the same time as we were
moving to using the standard inttypes.h format macros (commit a0ed19e0a9e). It doesn't seem useful to keep a new already-deprecated
interface like this with only a few users, so remove the new symbols
again and have the callers use PRIx64.
(Also, INT64_HEX_FORMAT was kind of a misnomer, since hex formats all
use unsigned types.)
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/0ac47b5d-e5ab-4cac-98a7-bdee0e2831e4%40eisentraut.org
Fujii Masao [Wed, 6 Aug 2025 07:47:20 +0000 (16:47 +0900)]
doc: Recommend ANALYZE after ALTER TABLE ... SET EXPRESSION AS.
ALTER TABLE ... SET EXPRESSION AS removes statistics for the target column,
so running ANALYZE afterward is recommended. But this was previously not
documented, even though a similar recommendation exists for
ALTER TABLE ... SET DATA TYPE, which also clears the column's statistics.
This commit updates the documentation to include the ANALYZE recommendation
for SET EXPRESSION AS.
Since v18, virtual generated columns are supported, and these columns never
have statistics. Therefore, ANALYZE is not needed after SET DATA TYPE or
SET EXPRESSION AS when used on virtual generated columns. This commit also
updates the documentation to clarify that ANALYZE is unnecessary in such cases.
Back-patch the ANALYZE recommendation for SET EXPRESSION AS to v17
where the feature was introduced, and the note about virtual generated
columns to v18 where those columns were added.
Tom Lane [Tue, 5 Aug 2025 20:51:10 +0000 (16:51 -0400)]
Fix incorrect return value in brin_minmax_multi_distance_numeric().
The result of "DirectFunctionCall1(numeric_float8, d)" is already in
Datum form, but the code was incorrectly applying PG_RETURN_FLOAT8()
to it. On machines where float8 is pass-by-reference, this would
result in complete garbage, since an unpredictable pointer value
would be treated as an integer and then converted to float. It's not
entirely clear how much of a problem would ensue on 64-bit hardware,
but certainly interpreting a float8 bitpattern as uint64 and then
converting that to float isn't the intended behavior.
As luck would have it, even the complete-garbage case doesn't break
BRIN indexes, since the results are only used to make choices about
how to merge values into ranges: at worst, we'd make poor choices
resulting in an inefficient index. Doubtless that explains the lack
of field complaints. However, users with BRIN indexes that use the
numeric_minmax_multi_ops opclass may wish to reindex in hopes of
making their indexes more efficient.
Author: Peter Eisentraut <peter@eisentraut.org> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2093712.1753983215@sss.pgh.pa.us
Backpatch-through: 14
Jeff Davis [Tue, 5 Aug 2025 16:06:05 +0000 (09:06 -0700)]
Don't copy datlocale from template unless provider matches.
During CREATE DATABASE, if changing the locale provider, require that
a new locale is specified rather than trying to reinterpret the
template's locale using the new provider.
This only affects the behavior when the template uses the builtin
provider and CREATE DATABASE specifies the ICU provider without
specifying the locale. Previously, that may have succeeded due to
loose validation by ICU, whereas now that will cause an error. Because
it can cause an error, backport only to unreleased versions.
Amit Kapila [Tue, 5 Aug 2025 09:21:50 +0000 (09:21 +0000)]
Throw ERROR when publish_generated_columns is specified without a value.
Previously, specifying the publication option 'publish_generated_columns'
without an explicit value would incorrectly default to 'stored', which is
not the intended behavior.
This patch fixes the issue by raising an ERROR when no value is provided
for 'publish_generated_columns', ensuring that users must explicitly
specify a valid option.
Author: Peter Smith <smithpb2250@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com>
Backpatch-through: 18, where it was introduced
Discussion: https://postgr.es/m/CAHut+PsCUCWiEKmB10DxhoPfXbF6jw5RD9ib2LuaQeA_XraW7w@mail.gmail.com
Fujii Masao [Mon, 4 Aug 2025 11:51:42 +0000 (20:51 +0900)]
Avoid unexpected shutdown when sync_replication_slots is enabled.
Previously, enabling sync_replication_slots while wal_level was not set
to logical could cause the server to shut down. This was because
the postmaster performed a configuration check before launching
the slot synchronization worker and raised an ERROR if the settings
were incompatible. Since ERROR is treated as FATAL in the postmaster,
this resulted in the entire server shutting down unexpectedly.
This commit changes the postmaster to log that message with a LOG-level
instead of raising an ERROR, allowing the server to continue running
even with the misconfiguration.
Back-patch to v17, where slot synchronization was introduced.
Reported-by: Hugo DUBOIS <hdubois@scaleway.com>
Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Hugo DUBOIS <hdubois@scaleway.com> Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Discussion: https://postgr.es/m/CAH0PTU_pc3oHi__XESF9ZigCyzai1Mo3LsOdFyQA4aUDkm01RA@mail.gmail.com
Backpatch-through: 17
Álvaro Herrera [Mon, 4 Aug 2025 11:26:45 +0000 (13:26 +0200)]
doc: mention unusability of dropped CHECK to verify NOT NULL
It's possible to use a CHECK (col IS NOT NULL) constraint to skip
scanning a table for nulls when adding a NOT NULL constraint on the same
column. However, if the CHECK constraint is dropped on the same command
that the NOT NULL is added, this fails, i.e., makes the NOT NULL addition
slow. The best we can do about it at this stage is to document this so
that users aren't taken by surprise.
(In Postgres 18 you can directly add the NOT NULL constraint as NOT
VALID instead, so there's no longer much use for the CHECK constraint,
therefore no point in building mechanism to support the case better.)
Reported-by: Andrew <psy2000usa@yahoo.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/175385113607.786.16774570234342968908@wrigleys.postgresql.org
Fujii Masao [Sun, 3 Aug 2025 01:49:03 +0000 (10:49 +0900)]
Fix assertion failure in pgbench when handling multiple pipeline sync messages.
Previously, when running pgbench in pipeline mode with a custom script
that triggered retriable errors (e.g., serialization errors),
an assertion failure could occur:
Assertion failed: (res == ((void*)0)), function discardUntilSync, file pgbench.c, line 3515.
The root cause was that pgbench incorrectly assumed only a single
pipeline sync message would be received at the end. In reality,
multiple pipeline sync messages can be sent and must be handled properly.
This commit fixes the issue by updating pgbench to correctly process
multiple pipeline sync messages, preventing the assertion failure.
Etsuro Fujita [Sat, 2 Aug 2025 09:30:00 +0000 (18:30 +0900)]
Doc: clarify the restrictions of AFTER triggers with transition tables.
It was not very clear that the triggers are only allowed on plain tables
(not foreign tables). Also, rephrase the documentation for better
readability.
Michael Paquier [Sat, 2 Aug 2025 08:08:48 +0000 (17:08 +0900)]
Fix use-after-free with INSERT ON CONFLICT changes in reorderbuffer.c
In ReorderBufferProcessTXN(), used to send the data of a transaction to
an output plugin, INSERT ON CONFLICT changes (INTERNAL_SPEC_INSERT) are
delayed until a confirmation record arrives (INTERNAL_SPEC_CONFIRM),
updating the change being processed.
8c58624df462 has added an extra step after processing a change to update
the progress of the transaction, by calling the callback
update_progress_txn() based on the LSN stored in a change after a
threshold of CHANGES_THRESHOLD (100) is reached. This logic has missed
the fact that for an INSERT ON CONFLICT change the data is freed once
processed, hence update_progress_txn() could be called pointing to a LSN
value that's already been freed. This could result in random crashes,
depending on the workload.
Per discussion, this issue is fixed by reusing in update_progress_txn()
the LSN from the change processed found at the beginning of the loop,
meaning that for a INTERNAL_SPEC_CONFIRM change the progress is updated
using the LSN of the INTERNAL_SPEC_CONFIRM change, and not the LSN from
its INTERNAL_SPEC_INSERT change. This is actually more correct, as we
want to update the progress to point to the INTERNAL_SPEC_CONFIRM
change.
Masahiko Sawada has found a nice trick to reproduce the issue: hardcode
CHANGES_THRESHOLD at 1 and run test_decoding (test "ddl" being enough)
on an instance running valgrind. The bug has been analyzed by Ethan
Mertz, who also originally suggested the solution used in this patch.
Issue introduced by 8c58624df462, so backpatch down to v16.
Author: Ethan Mertz <ethan.mertz@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/aIsQqDZ7x4LAQ6u1@paquier.xyz
Backpatch-through: 16
Nathan Bossart [Fri, 1 Aug 2025 21:52:11 +0000 (16:52 -0500)]
Allow resetting unknown custom GUCs with reserved prefixes.
Currently, ALTER DATABASE/ROLE/SYSTEM RESET [ALL] with an unknown
custom GUC with a prefix reserved by MarkGUCPrefixReserved() errors
(unless a superuser runs a RESET ALL variant). This is problematic
for cases such as an extension library upgrade that removes a GUC.
To fix, simply make sure the relevant code paths explicitly allow
it. Note that we require superuser or privileges on the parameter
to reset it. This is perhaps a bit more restrictive than is
necessary, but it's not clear whether further relaxing the
requirements is safe.
Oversight in commit 88103567cb. The ALTER SYSTEM fix is dependent
on commit 2d870b4aef, which first appeared in v17. Unfortunately,
back-patching that commit would introduce ABI breakage, and while
that breakage seems unlikely to bother anyone, it doesn't seem
worth the risk. Hence, the ALTER SYSTEM part of this commit is
omitted on v15 and v16.
Reported-by: Mert Alev <mert@futo.org> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/18964-ba09dea8c98fccd6%40postgresql.org
Backpatch-through: 15
libpq: Complain about missing BackendKeyData later with PGgetCancel()
PostgreSQL always sends the BackendKeyData message at connection
startup, but there are some third party backend implementations out
there that don't support cancellation, and don't send the message
[1]. While the protocol docs left it up for interpretation if that is
valid behavior, libpq in PostgreSQL 17 and below accepted it. It does
not seem like the libpq behavior was intentional though, since it did
so by sending CancelRequest messages with all zeros to such servers
(instead of returning an error or making the cancel a no-op).
In version 18 the behavior was changed to return an error when trying
to create the cancel object with PGgetCancel() or PGcancelCreate().
This was done without any discussion, as part of supporting different
lengths of cancel packets for the new 3.2 version of the protocol.
This commit changes the behavior of PGgetCancel() / PGcancel() once
more to only return an error when the cancel object is actually used
to send a cancellation, instead of when merely creating the object.
The reason to do so is that some clients [2] create a cancel object as
part of their connection creation logic (thus having the cancel object
ready for later when they need it), so if creating the cancel object
returns an error, the whole connection attempt fails. By delaying the
error, such clients will still be able to connect to the third party
backend implementations in question, but when actually trying to
cancel a query, the user will be notified that that is not possible
for the server that they are connected to.
This commit only changes the behavior of the older PGgetCancel() /
PQcancel() functions, not the more modern PQcancelCreate() family of
functions. I.e. PQcancelCreate() returns a failed connection object
(CONNECTION_BAD) if the server didn't send a cancellation key. Unlike
the old PQgetCancel() function, we're not aware of any clients in the
field that use PQcancelCreate() during connection startup in a way
that would prevent connecting to such servers.
[1] AWS RDS Proxy is definitely one of them, and CockroachDB might be
another.
Amit Kapila [Fri, 1 Aug 2025 07:46:22 +0000 (07:46 +0000)]
Fix a deadlock during ALTER SUBSCRIPTION ... DROP PUBLICATION.
A deadlock can occur when the DDL command and the apply worker acquire
catalog locks in different orders while dropping replication origins.
The issue is rare in PG16 and higher branches because, in most cases, the
tablesync worker performs the origin drop in those branches, and its
locking sequence does not conflict with DDL operations.
This patch ensures consistent lock acquisition to prevent such deadlocks.
As per buildfarm.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Ajin Cherian <itsajin@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 14, where it was introduced
Discussion: https://postgr.es/m/bab95e12-6cc5-4ebb-80a8-3e41956aa297@gmail.com
Tomas Vondra [Thu, 31 Jul 2025 13:17:26 +0000 (15:17 +0200)]
Fix tab completion for ALTER ROLE|USER ... RESET
Commit c407d5426b87 added tab completion for ALTER ROLE|USER ... RESET,
with the intent to offer only the variables actually set on the role.
But as soon as the user started typing something, it would start to
offer all possible matching variables.
Fix this the same way ALTER DATABASE ... RESET does it, i.e. by
properly considering the prefix.
A second issue causing similar symptoms (offering variables that are not
actually set for a role) was caused by a match to another pattern. The
ALTER DATABASE ... RESET was already excluded, so do the same thing for
ROLE/USER.
Report and fix by Dagfinn Ilmari Mannsåker. Backpatch to 18, same as c407d5426b87.
Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/87qzyghw2x.fsf%40wibble.ilmari.org
Discussion: https://postgr.es/m/87tt4lumqz.fsf%40wibble.ilmari.org
Backpatch-through: 18
Tomas Vondra [Thu, 31 Jul 2025 13:15:44 +0000 (15:15 +0200)]
Schema-qualify unnest() in ALTER DATABASE ... RESET
Commit 9df8727c5067 failed to schema-quality the unnest() call in the
query used to list the variables in ALTER DATABASE ... RESET. If there's
another unnest() function in the search_path, this could cause either
failures, or even security issues (when the tab-completion gets used by
privileged accounts).
Report and fix by Dagfinn Ilmari Mannsåker. Backpatch to 18, same as 9df8727c5067.
Author: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/87qzyghw2x.fsf%40wibble.ilmari.org
Discussion: https://postgr.es/m/87tt4lumqz.fsf%40wibble.ilmari.org
Backpatch-through: 18
Sort dump objects independent of OIDs, for the 7 holdout object types.
pg_dump sorts objects by their logical names, e.g. (nspname, relname,
tgname), before dependency-driven reordering. That removes one source
of logically-identical databases differing in their schema-only dumps.
In other words, it helps with schema diffing. The logical name sort
ignored essential sort keys for constraints, operators, PUBLICATION
... FOR TABLE, PUBLICATION ... FOR TABLES IN SCHEMA, operator classes,
and operator families. pg_dump's sort then depended on object OID,
yielding spurious schema diffs. After this change, OIDs affect dump
order only in the event of catalog corruption. While pg_dump also
wrongly ignored pg_collation.collencoding, CREATE COLLATION restrictions
have been keeping that imperceptible in practical use.
Use techniques like we use for object types already having full sort key
coverage. Where the pertinent queries weren't fetching the ignored sort
keys, this adds columns to those queries and stores those keys in memory
for the long term.
The ignorance of sort keys became more problematic when commit 172259afb563d35001410dc6daad78b250924038 added a schema diff test
sensitive to it. Buildfarm member hippopotamus witnessed that.
However, dump order stability isn't a new goal, and this might avoid
other dump comparison failures. Hence, back-patch to v13 (all supported
versions).
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/20250707192654.9e.nmisch@google.com
Backpatch-through: 13
doc: Adjust documentation for vacuumdb --missing-stats-only.
The sentence in question gave readers the impression that vacuumdb
removes statistics for a period of time while analyzing, but it's
actually meant to convey that --analyze-in-stages temporarily
replaces existing statistics with ones generated with lower
statistics targets.
Reported-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Reviewed-by: Frédéric Yhuel <frederic.yhuel@dalibo.com> Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://postgr.es/m/4b94ca16-7a6d-4581-b2aa-4ea79dbc082a%40dalibo.com
Backpatch-through: 18
Andrew Dunstan [Wed, 30 Jul 2025 15:04:05 +0000 (11:04 -0400)]
Revert Non text modes for pg_dumpall, and pg_restore support
Recent discussions of the mechanisms used to manage global data have
raised concerns about their robustness and security. Rather than try
to deal with those concerns at a very late stage of the release cycle,
the conclusion is to revert these features and work on them for the
next release.
This reverts parts or all of the following commits:
1495eff7bdb Non text modes for pg_dumpall, correspondingly change pg_restore 5db3bf7391d Clean up from commit 1495eff7bdb 289f74d0cb2 Add more TAP tests for pg_dumpall 2ef57908067 Fix a couple of error messages and tests for them b52a4a5f285 Clean up error messages from 1495eff7bdb 4170298b6ec Further cleanup for directory creation on pg_dump/pg_dumpall 22cb6d28950 Fix memory leak in pg_restore.c 928394b664b Improve various new-to-v18 appendStringInfo calls 39729ec01d2 Fix fat fingering in 22cb6d28950 5822bf21d50 Add missing space in pg_restore documentation. f09088a01d3 Free memory properly in pg_restore.c 40b9c27014d pg_restore cleanups 4aad2cb7707 Portability fix: isdigit() must be passed an unsigned char. 88e947136b4 Fix typos and grammar in the code f60420cff66 doc: Alphabetize long options for pg_dump[all]. bc35adee8d7 doc: Put new options in consistent order on man pages a876464abc7 Message style improvements dec6643487b Improve pg_dump/pg_dumpall help synopses and terminology 0ebd2425558 Run pgperltidy
Michael Paquier [Wed, 30 Jul 2025 02:55:46 +0000 (11:55 +0900)]
Fix ./configure checks with __cpuidex() and __cpuid()
The configure checks used two incorrect functions when checking the
presence of some routines in an environment:
- __get_cpuidex() for the check of __cpuidex().
- __get_cpuid() for the check of __cpuid().
This means that Postgres has never been able to detect the presence of
these functions, impacting environments where these exist, like Windows.
Simply fixing the function name does not work. For example, using
configure with MinGW on Windows causes the checks to detect all four of
__get_cpuid(), __get_cpuid_count(), __cpuidex() and __cpuid() to be
available, causing a compilation failure as this messes up with the
MinGW headers as we would include both <intrin.h> and <cpuid.h>.
The Postgres code expects only one in { __get_cpuid() , __cpuid() } and
one in { __get_cpuid_count() , __cpuidex() } to exist. This commit
reshapes the configure checks to do exactly what meson is doing, which
has been working well for us: check one, then the other, but never allow
both to be detected in a given build.
The logic is wrong since 3dc2d62d0486 and 792752af4eb5 where these
checks have been introduced (the second case is most likely a copy-pasto
coming from the first case), with meson documenting that the configure
checks were broken. As far as I can see, they are not once applied
consistently with what the code expects, but let's see if the buildfarm
has different something to say. The comment in meson.build is adjusted
as well, to reflect the new reality.
If the client sent a query cancel request with backend PID 0, it
tripped an assertion. With assertions disabled, you got this in the
log instead:
LOG: invalid cancel request with PID 0
LOG: wrong key in cancel request for process 0
Query cancellations don't even require authentication, so we better
tolerate bogus requests. Fix by turning the assertion into a regular
runtime check.
Spotted while testing libpq behavior with a modified server that
didn't send BackendKeyData to the client.
Tom Lane [Tue, 29 Jul 2025 19:17:40 +0000 (15:17 -0400)]
Don't put library-supplied -L/-I switches before user-supplied ones.
For many optional libraries, we extract the -L and -l switches needed
to link the library from a helper program such as llvm-config. In
some cases we put the resulting -L switches into LDFLAGS ahead of
-L switches specified via --with-libraries. That risks breaking
the user's intention for --with-libraries.
It's not such a problem if the library's -L switch points to a
directory containing only that library, but on some platforms a
library helper may "helpfully" offer a switch such as -L/usr/lib
that points to a directory holding all standard libraries. If the
user specified --with-libraries in hopes of overriding the standard
build of some library, the -L/usr/lib switch prevents that from
happening since it will come before the user-specified directory.
To fix, avoid inserting these switches directly into LDFLAGS during
configure, instead adding them to LIBDIRS or SHLIB_LINK. They will
still eventually get added to LDFLAGS, but only after the switches
coming from --with-libraries.
The same problem exists for -I switches: those coming from
--with-includes should appear before any coming from helper programs
such as llvm-config. We have not heard field complaints about this
case, but it seems certain that a user attempting to override a
standard library could have issues.
The changes for this go well beyond configure itself, however,
because many Makefiles have occasion to manipulate CPPFLAGS to
insert locally-desirable -I switches, and some of them got it wrong.
The correct ordering is any -I switches pointing at within-the-
source-tree-or-build-tree directories, then those from the tree-wide
CPPFLAGS, then those from helper programs. There were several places
that risked pulling in a system-supplied copy of libpq headers, for
example, instead of the in-tree files. (Commit cb36f8ec2 fixed one
instance of that a few months ago, but this exercise found more.)
The Meson build scripts may or may not have any comparable problems,
but I'll leave it to someone else to investigate that.
Reported-by: Charles Samborski <demurgos@demurgos.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/70f2155f-27ca-4534-b33d-7750e20633d7@demurgos.net
Backpatch-through: 13
Tom Lane [Tue, 29 Jul 2025 16:47:19 +0000 (12:47 -0400)]
Remove unnecessary complication around xmlParseBalancedChunkMemory.
When I prepared 71c0921b6 et al yesterday, I was thinking that the
logic involving explicitly freeing the node_list output was still
needed to dodge leakage bugs in libxml2. But I was misremembering:
we introduced that only because with early 2.13.x releases we could
not trust xmlParseBalancedChunkMemory's result code, so we had to
look to see if a node list was returned or not. There's no reason
to believe that xmlParseBalancedChunkMemory will fail to clean up
the node list when required, so simplify. (This essentially
completes reverting all the non-cosmetic changes in 6082b3d5d.)
Reported-by: Jim Jones <jim.jones@uni-muenster.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/997668.1753802857@sss.pgh.pa.us
Backpatch-through: 13
This commit documents differences in the definition of word separators for
the initcap function between libc and ICU locale providers.
Backpatch to all supported branches.
Tom Lane [Mon, 28 Jul 2025 20:50:41 +0000 (16:50 -0400)]
Avoid regression in the size of XML input that we will accept.
This mostly reverts commit 6082b3d5d, "Use xmlParseInNodeContext
not xmlParseBalancedChunkMemory". It turns out that
xmlParseInNodeContext will reject text chunks exceeding 10MB, while
(in most libxml2 versions) xmlParseBalancedChunkMemory will not.
The bleeding-edge libxml2 bug that we needed to work around a year
ago is presumably no longer a factor, and the argument that
xmlParseBalancedChunkMemory is semi-deprecated is not enough to
justify a functionality regression. Hence, go back to doing it
the old way.
Reported-by: Michael Paquier <michael@paquier.xyz>
Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Erik Wienhold <ewie@ewie.name> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aIGknLuc8b8ega2X@paquier.xyz
Backpatch-through: 13
As a result, an attempt to set synchronous_standby_names to any value
that does not parse resulted in a generic "parser failed." message
rather than anything more specific. This fixes that.
Michael Paquier [Sun, 27 Jul 2025 23:15:16 +0000 (08:15 +0900)]
Fix performance regression with flush of pending fixed-numbered stats
The callback added in fc415edf8ca8 used to check if there is any pending
data to flush for fixed-numbered statistics, done by looping across all
the builtin and custom stats kinds with a call to have_fixed_pending_cb,
is proving to able to show in workloads that do not report any stats
(read-only, no function calls, no WAL, no IO, etc). The code used in
v17 was cheaper than that what HEAD has introduced, relying on three
boolean checks for WAL, SLRU and IO stats.
This commit switches the code to use a more efficient approach than fc415edf8ca8, with a single boolean flag that can be switched to "true"
by any fixed-numbered stats kinds to force pgstat_report_stat() to go
through one round of reports. The flag is reset by pgstat_report_stat()
once a full round of reports is done. The flag being false means that
fixed-numbered stats kinds saw no activity, and that there is no pending
data to flush.
ac000fca743e took one step in improving the performance by reducing the
number of stats kinds that the backend can hold. This commit takes a
more drastic step by bringing back the code efficiency to what it was
before v18 with a cheap check at the beginning of pgstat_report_stat()
for its fast-exit path.
The callback have_static_pending_cb is removed as an effect of all that.
If the number of sync requests is big enough, the palloc() call in
AbsorbSyncRequests() will attempt to allocate more than 1 GB of memory,
resulting in failure. This can lead to an infinite loop in the checkpointer
process, as it repeatedly fails to absorb the pending requests.
This commit limits the checkpointer requests queue size to 10M items. In
addition to preventing the palloc() failure, this change helps to avoid long
queue processing time.
Also, this commit is for backpathing only. The master branch receives
a more invasive yet comprehensive fix for this problem.
Fix background worker not restarting after crash-and-restart cycle.
Previously, if a background worker crashed (e.g., due to a SIGKILL) and
the server restarted due to restart_after_crash being enabled,
the worker was not restarted as expected. Background workers without
the never-restart flag should automatically restart in this case.
This issue was introduced in commit 28a520c0b77, which failed to reset
the rw_pid field in the RegisteredBgWorker struct for the crashed worker.
This commit fixes the problem by resetting rw_pid for all eligible
background workers during the crash-and-restart cycle.
Back-patched to v18, where the bug was introduced.
Bug fix patches were proposed by Andrey Rudometov and ChangAo Chen,
but this commit uses a different approach.
Michael Paquier [Fri, 25 Jul 2025 07:17:31 +0000 (16:17 +0900)]
Fix assertion failure with latch wait in single-user mode
LatchWaitSetPostmasterDeathPos, the latch event position for the
postmaster death event, is initialized under IsUnderPostmaster.
WaitLatch() considered it as a valid wait target in single-user mode
(!IsUnderPostmaster), which was incorrect.
One code path found to fail with an assertion failure is a database drop
in single-user mode while waiting in WaitForProcSignalBarrier() after
the drop.
Michael Paquier [Fri, 25 Jul 2025 02:17:51 +0000 (11:17 +0900)]
Lower bounds related to pgstats kinds
This commit changes stats kinds to have the following bounds, making
their handling in core cheaper by default:
- PGSTAT_KIND_CUSTOM_MIN 128 -> 24
- PGSTAT_KIND_MAX 256 -> 32
The original numbers were rather high, and showed an impact on
performance in pgstat_report_stat() for the case of simple queries with
its early-exit path if there are no pending statistics to flush. This
logic will be improved more in a follow-up commit to bring the
performance of pgstat_report_stat() on par with v17 and older versions.
Lowering the bounds is a change worth doing on its own, independently of
the other improvement.
These new numbers should be enough to leave some room for the following
years for built-in and custom stats kinds, with stable ID numbers. At
least that should be enough to start with this facility for extension
developers. It can be always increased in the tree depending on the
requirements wanted.
Per discussion with Andres Freund and Bertrand Drouvot.
Amit Kapila [Thu, 24 Jul 2025 08:50:40 +0000 (08:50 +0000)]
Fix duplicate transaction replay during pg_createsubscriber.
Previously, the tool could replay the same transaction twice, once during
recovery, then again during replication after the subscriber was set up.
This occurred because the same recovery_target_lsn was used both to
finalize recovery and to start replication. If
recovery_target_inclusive = true, the transaction at that LSN would be
applied during recovery and then sent again by the publisher leading to
duplication.
To prevent this, we now set recovery_target_inclusive = false. This
ensures the transaction at recovery_target_lsn is not reapplied during
recovery, avoiding duplication when replication begins.
Bug #18897 Reported-by: Zane Duffield <duffieldzane@gmail.com>
Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: vignesh C <vignesh21@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/18897-d3db67535860dddb@postgresql.org
Tom Lane [Wed, 23 Jul 2025 19:44:29 +0000 (15:44 -0400)]
Fix build breakage on Solaris-alikes with late-model GCC.
Solaris has never bothered to add "const" to the second argument
of PAM conversation procs, as all other Unixen did decades ago.
This resulted in an "incompatible pointer" compiler warning when
building --with-pam, but had no more serious effect than that,
so we never did anything about it. However, as of GCC 14 the
case is an error not warning by default.
To complicate matters, recent OpenIndiana (and maybe illumos
in general?) *does* supply the "const" by default, so we can't
just assume that platforms using our solaris template need help.
What we can do, short of building a configure-time probe,
is to make solaris.h #define _PAM_LEGACY_NONCONST, which
causes OpenIndiana's pam_appl.h to revert to the traditional
definition, and hopefully will have no effect anywhere else.
Then we can use that same symbol to control whether we include
"const" in the declaration of pam_passwd_conv_proc().
Bug: #18995 Reported-by: Andrew Watkins <awatkins1966@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18995-82058da9ab4337a7@postgresql.org
Backpatch-through: 13
Andres Freund [Tue, 22 Jul 2025 12:30:52 +0000 (08:30 -0400)]
aio: Fix assertion, clarify README
The assertion wouldn't have triggered for a long while yet, but this won't
accidentally fail to detect the issue if/when it occurs.
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/CAEze2Wj-43JV4YufW23gm=Uwr7Lkj+p0yKctKHxNm1rwFC+_DQ@mail.gmail.com
Backpatch-through: 18
Amit Kapila [Tue, 22 Jul 2025 05:56:22 +0000 (05:56 +0000)]
Doc: Fix logical replication examples.
The definition of \dRp+ was modified in commit 7054186c4e. This patch
updates the column list and row filter examples to align with the revised
definition.
Author: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed by: Peter Smith <smithpb2250@gmail.com>
Backpatch-through: 18, where it was introduced
Discussion: https://postgr.es/m/CANhcyEUvqkSO6b9zi_fs_BBPEge5acj4mf8QKmq2TX-7axa7EQ@mail.gmail.com
Michael Paquier [Tue, 22 Jul 2025 05:34:19 +0000 (14:34 +0900)]
doc: Inform about aminsertcleanup optional NULLness
This index AM callback has been introduced in c1ec02be1d79 and it is
optional, currently only being used by BRIN. Optional callbacks are
documented with NULL as possible value in amapi.h and indexam.sgml, but
this callback has missed this part of the description.
Reported-by: Peter Smith <smithpb2250@gmail.com> Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/CAHut+PvgYcPmPDi1YdHMJY5upnyGRpc0N8pk1xNB11xDSBwNog@mail.gmail.com
Backpatch-through: 17
Michael Paquier [Tue, 22 Jul 2025 05:00:04 +0000 (14:00 +0900)]
ecpg: Fix NULL pointer dereference during connection lookup
ECPGconnect() caches established connections to the server, supporting
the case of a NULL connection name when a database name is not specified
by its caller.
A follow-up call to ECPGget_PGconn() to get an established connection
from the cached set with a non-NULL name could cause a NULL pointer
dereference if a NULL connection was listed in the cache and checked for
a match. At least two connections are necessary to reproduce the issue:
one with a NULL name and one with a non-NULL name.
Author: Aleksander Alekseev <aleksander@tigerdata.com>
Discussion: https://postgr.es/m/CAJ7c6TNvFTPUTZQuNAoqgzaSGz-iM4XR61D7vEj5PsQXwg2RyA@mail.gmail.com
Backpatch-through: 13
pg_dump: include comments on not-null constraints on domains, too
Commit e5da0fe3c22b introduced catalog entries for not-null constraints
on domains; but because commit b0e96f311985 (the original work for
catalogued not-null constraints on tables) forgot to teach pg_dump to
process the comments for them, this one also forgot. Add that now.
We also need to teach repairDependencyLoop() about the new type of
constraints being possible for domains.
Backpatch-through: 17 Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF-0bqVR=j4jonS6N2Ka6hHUpFyu3_3TWKNhOW_4yFSSg@mail.gmail.com
doc: Document reopen of output file via SIGHUP in pg_recvlogical.
When pg_recvlogical receives a SIGHUP signal, it closes the current
output file and reopens a new one. This is useful since it allows us to
rotate the output file by renaming the current file and sending a SIGHUP.
This behavior was previously undocumented. This commit adds
the missing documentation.
This commit is only for HEAD and v18, where the test has been removed.
It also incorporates improvements below to stability and coverage of the
original test, which were already backpatched to v17.
- Add one pg_logical_emit_message() call to force the creation of a record
that spawns across two pages.
- Make the logic wait for the checkpoint completion.
Author: Alexander Korotkov <akorotkov@postgresql.org> Co-authored-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Michael Paquier <michael@paquier.xyz>
Backpatch-through: 18
Improve the stability of the recovery test 047_checkpoint_physical_slot
Currently, the comments in 047_checkpoint_physical_slot. It shows an
incomplete intention to wait for checkpoint completion before performing
an immediate database stop. However, an immediate node stop can occur both
before and after checkpoint completion. Both cases should work correctly.
But we would like the test to be more stable and deterministic. This is why
this commit makes this test explicitly wait for the checkpoint completion
log message.
Discussion: https://postgr.es/m/CAPpHfdurV-j_e0pb%3DUFENAy3tyzxfF%2ByHveNDNQk2gM82WBU5A%40mail.gmail.com
Discussion: https://postgr.es/m/aHXLep3OaX_vRTNQ%40paquier.xyz
Author: Alexander Korotkov <akorotkov@postgresql.org> Reviewed-by: Michael Paquier <michael@paquier.xyz>
Backpatch-through: 17
Fix infinite wait when reading a partially written WAL record
If a crash occurs while writing a WAL record that spans multiple pages, the
recovery process marks the page with the XLP_FIRST_IS_OVERWRITE_CONTRECORD
flag. However, logical decoding currently attempts to read the full WAL
record based on its expected size before checking this flag, which can lead
to an infinite wait if the remaining data is never written (e.g., no activity
after crash).
This patch updates the logic first to read the page header and check for
the XLP_FIRST_IS_OVERWRITE_CONTRECORD flag before attempting to reconstruct
the full WAL record. If the flag is set, decoding correctly identifies
the record as incomplete and avoids waiting for WAL data that will never
arrive.
Discussion: https://postgr.es/m/CAAKRu_ZCOzQpEumLFgG_%2Biw3FTa%2BhJ4SRpxzaQBYxxM_ZAzWcA%40mail.gmail.com
Discussion: https://postgr.es/m/CALDaNm34m36PDHzsU_GdcNXU0gLTfFY5rzh9GSQv%3Dw6B%2BQVNRQ%40mail.gmail.com
Author: Vignesh C <vignesh21@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 13
Dean Rasheed [Fri, 18 Jul 2025 08:59:40 +0000 (09:59 +0100)]
Fix concurrent update trigger issues with MERGE in a CTE.
If a MERGE inside a CTE attempts an UPDATE or DELETE on a table with
BEFORE ROW triggers, and a concurrent UPDATE or DELETE happens, the
merge code would fail (crashing in the case of an UPDATE action, and
potentially executing the wrong action for a DELETE action).
This is the same issue that 9321c79c86 attempted to fix, except now
for a MERGE inside a CTE. As noted in 9321c79c86, what needs to happen
is for the trigger code to exit early, returning the TM_Result and
TM_FailureData information to the merge code, if a concurrent
modification is detected, rather than attempting to do an EPQ
recheck. The merge code will then do its own rechecking, and rescan
the action list, potentially executing a different action in light of
the concurrent update. In particular, the trigger code must never call
ExecGetUpdateNewTuple() for MERGE, since that is bound to fail because
MERGE has its own per-action projection information.
Commit 9321c79c86 did this using estate->es_plannedstmt->commandType
in the trigger code to detect that a MERGE was being executed, which
is fine for a plain MERGE command, but does not work for a MERGE
inside a CTE. Fix by passing that information to the trigger code as
an additional parameter passed to ExecBRUpdateTriggers() and
ExecBRDeleteTriggers().
Back-patch as far as v17 only, since MERGE cannot appear inside a CTE
prior to that. Additionally, take care to preserve the trigger ABI in
v17 (though not in v18, which is still in beta).
Bug: #18986 Reported-by: Yaroslav Syrytsia <me@ys.lc>
Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18986-e7a8aac3d339fa47@postgresql.org
Backpatch-through: 17
Tom Lane [Thu, 17 Jul 2025 16:46:38 +0000 (12:46 -0400)]
Fix PQport to never return NULL unless the connection is NULL.
This is the documented behavior, and it worked that way before
v10. However, addition of the connhost[] array created cases
where conn->connhost[conn->whichhost].port is NULL. The rest
of libpq is careful to substitute DEF_PGPORT[_STR] for a null
or empty port string, but we failed to do so here, leading to
possibly returning NULL. As of v18 that causes psql's \conninfo
command to segfault. Older psql versions avoid that, but it's
pretty likely that other clients have trouble with this,
so we'd better back-patch the fix.
In stable branches, just revert to our historical behavior of
returning an empty string when there was no user-given port
specification. However, it seems substantially more useful and
indeed more correct to hand back DEF_PGPORT_STR in such cases,
so let's make v18 and master do that.
Author: Daniele Varrazzo <daniele.varrazzo@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+mi_8YTS8WPZPO0PAb2aaGLwHuQ0DEQRF0ZMnvWss4y9FwDYQ@mail.gmail.com
Backpatch-through: 13
We have an assertion to ensure that a command tag has been assigned by
the time we're done executing, but if we happen to execute a command
with no queries, the assertion would fail. Per discussion, rather than
contort things to get a tag assigned, just remove the assertion.
Oversight in 2f9661311b83. That commit also retained a comment that
explained logic that had been adjacent to it but diffused into various
places, leaving none apt to keep part of the comment. Remove that part,
and rewrite what remains for extra clarity.
Bug: #18984
Backpatch-through: 13 Reported-by: Aleksander Alekseev <aleksander@tigerdata.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Michaël Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/18984-0f4778a6599ac3ae@postgresql.org
This commit adds a note to the pg_overexplain page that describes
how to use it (LOAD, session_preload_libraries, or
shared_preload_libraries). The new text is mostly lifted from the
auto_explain page. We should probably consider centralizing this
information in the future.
While at it, add a missing "module" to the opening sentence.
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/aHVWKM8l8kLlZzgv%40nathan
Backpatch-through: 18
Michael Paquier [Thu, 17 Jul 2025 00:32:49 +0000 (09:32 +0900)]
Fix inconsistent LWLock tranche names for MultiXact*
The terms used in wait_event_names.txt and lwlock.c were inconsistent
for MultiXactOffsetSLRU and MultiXactMemberSLRU, which could cause joins
between pg_wait_events and pg_stat_activity to fail. lwlock.c is
adjusted in this commit to what the historical name of the event has
always been, and what is documented.
The paragraph for introducing INSERT and COPY discussed how a file
could be used for bulk loading with COPY, without actually showing
what the file would look like. This adds a programlisting for the
file contents.
Backpatch to all supported branches since this example has lacked
the file contents since PostgreSQL 7.2.
Author: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/158017814191.19852.15019251381150731439@wrigleys.postgresql.org
Backpatch-through: 13
Fix dumping of comments on invalid constraints on domains
We skip dumping constraints together with domains if they are invalid
('separate') so that they appear after data -- but their comments were
dumped together with the domain definition, which in effect leads to the
comment being dumped when the constraint does not yet exist. Delay
them in the same way.
Oversight in 7eca575d1c28; backpatch all the way back.
Author: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxF_C2pe6J_+nPr6C5jf5rQnbYP8XOKr4HM8yHZtp2aQqQ@mail.gmail.com
Jeff Davis [Wed, 16 Jul 2025 16:57:07 +0000 (09:57 -0700)]
pg_dumpall: Skip global objects with --statistics-only or --no-schema.
Previously, pg_dumpall would still dump global objects such as roles
and tablespaces even when --statistics-only or --no-schema was specified.
Since these global objects are treated as schema-level data, they should
be skipped in these cases.
This commit fixes the issue by ensuring that global objects are not
dumped when either --statistics-only or --no-schema is used.
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/aHVo791guQR6uqwT%40nathan
Backpatch-through: 13
doc: Fix confusing description of streaming option in START_REPLICATION.
Previously, the documentation described the streaming option as a boolean,
which is outdated since it's no longer a boolean as of protocol version 4.
This could confuse users.
This commit updates the description to remove the "boolean" reference and
clearly list the valid values for the streaming option.
Back-patch to v16, where the streaming option changed to a non-boolean.
doc: Clarify that total_vacuum_time excludes VACUUM FULL.
The last_vacuum and vacuum_count fields in pg_stat_all_tables already
state that they do not include VACUUM FULL. However, total_vacuum_time,
which also excludes VACUUM FULL, did not mention this. This could
mislead users into thinking VACUUM FULL time is included.
To address this, this commit updates the documentation for
pg_stat_all_tables to explicitly state that total_vacuum_time does not
count VACUUM FULL.
Back-patched to v18, where total_vacuum_time was introduced.
Additionally, this commit clarifies that n_ins_since_vacuum also
excludes VACUUM FULL. Although n_ins_since_vacuum was added in v13,
we are not back-patching this change to stable branches, as it is
a documentation improvement, not a bug fix.
Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/2ac375d1-591b-4f1b-a2af-f24335567866@oss.nttdata.com
Backpatch-through: 18
Tom Lane [Tue, 15 Jul 2025 22:53:00 +0000 (18:53 -0400)]
Doc: clarify description of regexp fields in pg_ident.conf.
The grammar was a little shaky and confusing here, so word-smith it
a bit. Also, adjust the comments in pg_ident.conf.sample to use the
same terminology as the SGML docs, in particular "DATABASE-USERNAME"
not "PG-USERNAME".
Back-patch appropriate subsets. I did not risk changing
pg_ident.conf.sample in released branches, but it still seems OK
to change it in v18.
Reported-by: Alexey Shishkin <alexey.shishkin@enterprisedb.com>
Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/175206279327.3157504.12519088928605422253@wrigleys.postgresql.org
Backpatch-through: 13
Tom Lane [Tue, 15 Jul 2025 22:11:18 +0000 (18:11 -0400)]
Silence uninitialized-value warnings in compareJsonbContainers().
Because not every path through JsonbIteratorNext() sets val->type,
some compilers complain that compareJsonbContainers() is comparing
possibly-uninitialized values. The paths that don't set it return
WJB_DONE, WJB_END_ARRAY, or WJB_END_OBJECT, so it's clear by
manual inspection that the "(ra == rb)" code path is safe, and
indeed we aren't seeing warnings about that. But the (ra != rb)
case is much less obviously safe. In Assert-enabled builds it
seems that the asserts rejecting WJB_END_ARRAY and WJB_END_OBJECT
persuade gcc 15.x not to warn, which makes little sense because
it's impossible to believe that the compiler can prove of its
own accord that ra/rb aren't WJB_DONE here. (In fact they never
will be, so the code isn't wrong, but why is there no warning?)
Without Asserts, the appearance of warnings is quite unsurprising.
We discussed fixing this by converting those two Asserts into
pg_assume, but that seems not very satisfactory when it's so unclear
why the compiler is or isn't warning: the warning could easily
reappear with some other compiler version. Let's fix it in a less
magical, more future-proof way by changing JsonbIteratorNext()
so that it always does set val->type. The cost of that should be
pretty negligible, and it makes the function's API spec less squishy.
Reported-by: Erik Rijkers <er@xs4all.nl>
Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/988bf1bc-3f1f-99f3-bf98-222f1cd9dc5e@xs4all.nl
Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru
Backpatch-through: 13
Tom Lane [Tue, 15 Jul 2025 20:35:42 +0000 (16:35 -0400)]
Doc: clarify description of current-date/time functions.
Minor wordsmithing of the func.sgml paragraph describing
statement_timestamp() and allied functions: don't switch between
"statement" and "command" when those are being used to mean about
the same thing.
Also, add some text to protocol.sgml describing the perhaps-surprising
behavior these functions have in a multi-statement Query message.
Reported-by: P M <petermittere@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Discussion: https://postgr.es/m/175223006802.3157505.14764328206246105568@wrigleys.postgresql.org
Backpatch-through: 13
Thomas Munro [Sat, 12 Jul 2025 04:20:11 +0000 (16:20 +1200)]
aio: Fix configuration reload in IO workers.
method_worker.c installed SignalHandlerForConfigReload, but it failed to
actually process reload requests. That hasn't yet produced any concrete
problem reports in terms of GUC changes it should have cared about in
v18, but it was inconsistent.
It did cause problems for a couple of patches in development that need
IO workers to react to ALTER SYSTEM + pg_reload_conf(). Fix extracted
from one of those patches.
Thomas Munro [Sat, 12 Jul 2025 01:47:59 +0000 (13:47 +1200)]
aio: Remove obsolete IO worker ID references.
In an ancient ancestor of this code, the postmaster assigned IDs to IO
workers. Now it tracks them in an unordered array and doesn't know
their IDs, so it might be confusing to readers that it still referred to
their indexes as IDs.
No change in behavior, just variable name and error message cleanup.
Thomas Munro [Sat, 12 Jul 2025 01:43:27 +0000 (13:43 +1200)]
aio: Regularize IO worker internal naming.
Adopt PgAioXXX convention for pgaio module type names. Rename a
function that didn't use a pgaio_worker_ submodule prefix. Rename the
internal submit function's arguments to match the indirectly relevant
function pointer declaration and nearby examples. Rename the array of
handle IDs in PgAioSubmissionQueue to sqes, a term of art seen in the
systems it emulates, also clarifying that they're not IO handle
pointers as the old name might imply.
No change in behavior, just type, variable and function name cleanup.
Thomas Munro [Fri, 11 Jul 2025 23:18:05 +0000 (11:18 +1200)]
Fix stale idle flag when IO workers exit.
Otherwise we could choose a worker that has exited and crash while
trying to wake it up.
Back-patch to 18.
Reported-by: Tomas Vondra <tomas@vondra.me> Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/t5aqjhkj6xdkido535pds7fk5z4finoxra4zypefjqnlieevbg%40357aaf6u525j
Tom Lane [Fri, 11 Jul 2025 22:50:13 +0000 (18:50 -0400)]
Fix inconsistent quoting of role names in ACLs.
getid() and putid(), which parse and deparse role names within ACL
input/output, applied isalnum() to see if a character within a role
name requires quoting. They did this even for non-ASCII characters,
which is problematic because the results would depend on encoding,
locale, and perhaps even platform. So it's possible that putid()
could elect not to quote some string that, later in some other
environment, getid() will decide is not a valid identifier, causing
dump/reload or similar failures.
To fix this in a way that won't risk interoperability problems
with unpatched versions, make getid() treat any non-ASCII as a
legitimate identifier character (hence not requiring quotes),
while making putid() treat any non-ASCII as requiring quoting.
We could remove the resulting excess quoting once we feel that
no unpatched servers remain in the wild, but that'll be years.
A lesser problem is that getid() did the wrong thing with an input
consisting of just two double quotes (""). That has to represent an
empty string, but getid() read it as a single double quote instead.
The case cannot arise in the normal course of events, since we don't
allow empty-string role names. But let's fix it while we're here.
Although we've not heard field reports of problems with non-ASCII
role names, there's clearly a hazard there, so back-patch to all
supported versions.
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3792884.1751492172@sss.pgh.pa.us
Backpatch-through: 13
Jacob Champion [Fri, 11 Jul 2025 17:26:18 +0000 (10:26 -0700)]
oauth: Run Autoconf tests with correct compiler flags
Commit b0635bfda split off the CPPFLAGS/LDFLAGS/LDLIBS for libcurl into
their own separate Makefile variables, but I neglected to move the
existing AC_CHECKs for Curl into a place where they would make use of
those variables. They instead tested the system libcurl, which 1) is
unhelpful if a different Curl is being used for the build and 2) will
fail the build entirely if no system libcurl exists. Correct the order
of operations here.
Reported-by: Ivan Kush <ivan.kush@tantorlabs.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Ivan Kush <ivan.kush@tantorlabs.com>
Discussion: https://postgr.es/m/8a611028-51a1-408c-b592-832e2e6e1fc9%40tantorlabs.com
Backpatch-through: 18
Amit Kapila [Fri, 11 Jul 2025 04:58:29 +0000 (10:28 +0530)]
Fix the handling of two GUCs during upgrade.
Previously, the check_hook functions for max_slot_wal_keep_size and
idle_replication_slot_timeout would incorrectly raise an ERROR for values
set in postgresql.conf during upgrade, even though those values were not
actively used in the upgrade process.
To prevent logical slot invalidation during upgrade, we used to set
special values for these GUCs. Now, instead of relying on those values, we
directly prevent WAL removal and logical slot invalidation caused by
max_slot_wal_keep_size and idle_replication_slot_timeout.
Note: PostgreSQL 17 does not include the idle_replication_slot_timeout
GUC, so related changes were not backported.
BUG #18979 Reported-by: jorsol <jorsol@gmail.com>
Author: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed by: vignesh C <vignesh21@gmail.com>
Reviewed by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/219561.1751826409@sss.pgh.pa.us
Discussion: https://postgr.es/m/18979-a1b7fdbb7cd181c6@postgresql.org
doc: Clarify meaning of "idle" in idle_replication_slot_timeout.
This commit updates the documentation to clarify that "idle" in
idle_replication_slot_timeout means the replication slot is inactive,
that is, not currently used by any replication connection.
Without this clarification, "idle" could be misinterpreted to mean
that the slot is not advancing or that no data is being streamed,
even if a connection exists.
Back-patch to v18 where idle_replication_slot_timeout was added.
Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Gunnar Morling <gunnar.morling@googlemail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com
Backpatch-through: 18
Change unit of idle_replication_slot_timeout to seconds.
Previously, the idle_replication_slot_timeout parameter used minutes
as its unit, based on the assumption that values would typically exceed
one minute in production environments. However, this caused unexpected
behavior: specifying a value below 30 seconds would round down to 0,
effectively disabling the timeout. This could be surprising to users.
To allow finer-grained control and avoid such confusion, this commit changes
the unit of idle_replication_slot_timeout to seconds. Larger values can
still be specified easily using standard time suffixes, for example,
'24h' for 24 hours.
Back-patch to v18 where idle_replication_slot_timeout was added.
Reported-by: Gunnar Morling <gunnar.morling@googlemail.com>
Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com
Backpatch-through: 18