git.ipfire.org Git - thirdparty/postgresql.git/log

On Windows, make link(2) report ENOTSUP when appropriate.

CreateHardLinkA reports ERROR_INVALID_FUNCTION if the target file
is on a filesystem that doesn't support hard links.  _dosmaperr
maps that to EINVAL, which confuses zic.c into failure.  zic.c is
expecting ENOTSUP if the filesystem lacks link support, and will
properly fall back to making a physical copy if it gets that.
Hence, add code to map ERROR_INVALID_FUNCTION to ENOTSUP.
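
A minimal sketch of the mapping described above, not the committed code.
The helper name link_errno_for() and its "generic_errno" parameter (which
stands in for whatever _dosmaperr would have produced) are hypothetical.
The constant is defined locally so the sketch builds without Windows
headers.

```c
#include <assert.h>
#include <errno.h>

/* Numeric value of ERROR_INVALID_FUNCTION in <winerror.h>. */
#define SKETCH_ERROR_INVALID_FUNCTION 1UL

/*
 * Hypothetical helper: pick the errno that link() should report after
 * CreateHardLinkA has failed with the given Windows error code.
 */
int
link_errno_for(unsigned long winerr, int generic_errno)
{
	assert(generic_errno != 0);	/* caller passes a real error code */

	/* No hard-link support on this filesystem: callers can fall back to a copy. */
	if (winerr == SKETCH_ERROR_INVALID_FUNCTION)
		return ENOTSUP;

	/* Everything else keeps the generic mapping. */
	return generic_errno;
}
```

Keeping the special case local to link() leaves the generic mapping
untouched, which is the trade-off discussed in the next paragraph.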

(We could instead teach _dosmaperr to do that, but it's far from clear
that this would be appropriate as a global behavior: intuitively
it seems like EINVAL should be appropriate most of the time.)

We didn't need this before commit aeb07c55f, because the tzcode
version we were using before that didn't have this particular
error-handling logic.  Hence, no back-patch for now; but if we
decide to back-patch tzcode 2026b or later, we'll need this too.

Author: Vladlen Popolitov <v.popolitov@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/e6122f9b2eef9096f1f11ecc058bcd91@postgrespro.ru

libpq-oauth: Avoid overflow for very large intervals

The slow_down interval parsing code checks explicitly for overflow, but
since it does that after the signed overflow has already occurred, we
end up inviting undefined behavior from the compiler anyway.

Use checked arithmetic instead. set_timer() takes a long int in order to
interface nicely with libcurl, so use an int32 as the interval counter
and clamp to LONG_MAX during conversion to milliseconds.
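
As an illustration only (the helper names below are hypothetical and not
taken from libpq-oauth), checking before the arithmetic rather than after
it could look like this:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical checked addition: returns true on overflow and leaves
 * *result untouched, otherwise stores a + b and returns false.
 */
bool
add_s32_overflow(int32_t a, int32_t b, int32_t *result)
{
	if ((b > 0 && a > INT32_MAX - b) ||
		(b < 0 && a < INT32_MIN - b))
		return true;
	*result = a + b;
	return false;
}

/*
 * Hypothetical conversion from seconds to milliseconds for a timer API
 * that takes a long.  The bound is tested before multiplying, so the
 * product is only computed when it fits.
 */
long
interval_to_ms(int32_t seconds)
{
	assert(seconds >= 0);

	if ((long) seconds > LONG_MAX / 1000)
		return LONG_MAX;		/* clamp instead of overflowing */
	return (long) seconds * 1000;
}
```

On platforms where long is 64 bits wide the clamp never triggers for an
int32 input; it matters where long is 32 bits.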

Backpatch to 18, where libpq-oauth was introduced.

Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/qtclihmrkq67ach3xjxyi4qcksstin5qxwsnkqefkmotxwh4g6%40ae2bj6jvcmry
Backpatch-through: 18

Prevent walsummarizer from getting stuck at a timeline switch.

As previously coded, walsummarizer only wants to read WAL from a file
where the TimeLineID in the filename exactly matches the TimeLineID being
summarized. But in some cases, when a timeline switch occurs, the WAL file
from the old timeline is not archived, because it's never completely
filled, so the only way to obtain the contents of that last partial
segment is to read from the first segment on the new timeline. Teach
WAL summarizer to do that, and add a test case to make sure that it
works.

Reported-by: Nick Ivanov <nick.ivanov@enterprisedb.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Tested-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Thom Brown <thom@linux.com>
Discussion: http://postgr.es/m/CA+Tgmobr27GpKDZx3_ezW2+C5_g18i+jSK3sGF_cR-_ESv5N5A@mail.gmail.com
Backpatch-through: 17

Fix autovacuum's database sorting.

When db_comparator() was updated to use pg_cmp_s32(), the arguments
were listed in the wrong order. This caused autovacuum to sort the
databases by their scores in ascending order instead of descending
order. To fix, swap the arguments to pg_cmp_s32().
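
The effect of the argument order can be shown with a small stand-alone
sketch.  DbEntry, cmp_s32() and sort_dbs() are hypothetical stand-ins,
not the autovacuum code; cmp_s32() only imitates the three-way result of
pg_cmp_s32().

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for one database entry and its score. */
typedef struct DbEntry
{
	int32_t		score;
} DbEntry;

/* Overflow-free three-way comparison: <0, 0 or >0. */
int
cmp_s32(int32_t a, int32_t b)
{
	return (a > b) - (a < b);
}

/* Comparing (b, a) rather than (a, b) yields descending order. */
int
db_comparator_desc(const void *a, const void *b)
{
	const DbEntry *da = (const DbEntry *) a;
	const DbEntry *db = (const DbEntry *) b;

	return cmp_s32(db->score, da->score);
}

void
sort_dbs(DbEntry *dbs, size_t n)
{
	assert(dbs != NULL || n == 0);
	qsort(dbs, n, sizeof(DbEntry), db_comparator_desc);
}
```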

Oversight in commit 3b42bdb471.

Reported-by: Хамидуллин Рустам <r.khamidullin@postgrespro.ru>
Author: Хамидуллин Рустам <r.khamidullin@postgrespro.ru>
Discussion: https://postgr.es/m/5c5a7984-b149-b505-7ad9-2a7766c65b55%40postgrespro.ru
Backpatch-through: 17

Remove code for pre-v10 servers from AdjustUpgrade.pm.

We recently removed support for upgrading from pre-v10 servers, so
we no longer need to handle older versions in this helper module.

Oversight in commit 14d8418083.

Discussion: https://postgr.es/m/ak7Ekv2-L-G55-YD%40nathan

Fix Hash Join performance issue when hashing NULL values

adf97c156 allowed expression evaluation to perform hashing, and
subsequently 9ca67658d fixed a memory stomping bug in that commit
that caused unrelated-to-hashing expression op steps to stomp on the
intermediate hash value.  The intermediate hash value needs to be
maintained when hashing multiple hash keys.  9ca67658d didn't quite get
things right when in "strict" mode when it aborted hashing early after
encountering a NULL hash key.  What was meant to happen was that the
expression returns NULL directly to indicate to the caller the value
hashed to NULL.  The problem was that any EEOP_HASHDATUM_FIRST_STRICT or
EEOP_HASHDATUM_NEXT32_STRICT op step that didn't belong to the final
key to be hashed would have its op->resnull and op->resvalue pointing to
the location to store the intermediate hash value.  That's correct for
non-NULLs since we bit-rotate the intermediate value and continue hashing,
but with the strict case, when we get a NULL key, we immediately jump to
the "jumpdone" step.  The problem is the jumpdone step expects the
ExprState resnull and resvalue fields to be set (as they would be if we
didn't abort hashing early due to the NULL), but when we aborted early,
the ExprState fields never got set.  This would result in inserting
records into the hash table that would never match to any join partner,
which is a waste of CPU and memory.

Here we fix this by having EEOP_HASHDATUM_FIRST_STRICT and
EEOP_HASHDATUM_NEXT32_STRICT populate the ExprState resnull and resvalue
fields directly when the value to hash is NULL.

Although Hash Agg and Hashed Subplans do use hashing from ExprStates,
those were unaffected by this bug, as neither of those uses the STRICT op
steps.

Thanks to Tomas Vondra for finding the offending commit.

Reported-by: Dan Stefura <dstefura@bluecatnetworks.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/YQBPR0101MB89738FB972FBD02A3640C6D3D6C92@YQBPR0101MB8973.CANPRD01.PROD.OUTLOOK.COM
Backpatch-through: 18

Improve wording of sequence origin warning in logical replication.

check_publications_origin_sequences() warns when a subscription with
origin = NONE synchronizes sequence values that may have originated from
another subscription. The existing warning is phrased in terms of
copy_data and copying data, which is appropriate for table synchronization
but misleading for sequence synchronization.

Reword the warning, detail, and hint to describe sequence synchronization
and the associated origin = NONE semantics more accurately.

Also fix a typo ("rathen" -> "rather") in a comment in sequencesync.c.

Reported-by: Noah Misch <noah@leadboat.com>
Reported-by: Peter Smith <smithpb2250@gmail.com>
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 19, where it was introduced
Discussion: https://postgr.es/m/20260710045217.f0.noahmisch@microsoft.com

Fix error handling in port's getopt_long() for missing argument

In the long option error path, the code previously returned BADARG
immediately when optstring[0] == ':' for a missing required argument,
without advancing "optind" or resetting "place". This error handling
was inconsistent with the short option path, where both updates are
performed before returning BADARG, and it was also inconsistent with
libc.
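
The libc behavior referred to above can be observed with a small probe.
This is a sketch that assumes a libc providing getopt_long() (glibc
behaves as described); the function name scan_missing_argument() is
hypothetical and this is not the port code itself.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/*
 * Scan a command line whose only option, --file, lacks its required
 * argument.  Returns getopt_long()'s result and reports optind afterwards.
 */
int
scan_missing_argument(int *optind_after)
{
	static char arg0[] = "prog";
	static char arg1[] = "--file";
	char	   *argv[] = {arg0, arg1, NULL};
	static const struct option longopts[] = {
		{"file", required_argument, NULL, 'f'},
		{NULL, 0, NULL, 0}
	};
	int			c;

	assert(optind_after != NULL);

	optind = 1;					/* start scanning at argv[1] */

	/* A leading ':' in optstring asks for ':' on a missing argument. */
	c = getopt_long(2, argv, ":f:", longopts, NULL);

	*optind_after = optind;
	return c;
}
```

With glibc the call returns ':' and optind has moved past the offending
option, so a caller that keeps scanning does not see the same option
again.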

This affects platforms where our port version of getopt_long() is used,
a concept that should be limited to WIN32 these days. An argument could
be made in favor of a backpatch, but this could lead to slightly
different error handling, for a report that would only show up when
using incorrect option combinations.

Author: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/SY7PR01MB10921AF81F18A8BCFA06388BEB6C22@SY7PR01MB10921.ausprd01.prod.outlook.com

Fix issue with RANGE's DEFAULT partition pruning

Partition pruning for RANGE-partitioned tables could mistakenly prune
the DEFAULT partition in some cases when it was not valid to do so,
which could lead to rows missing from query results.

The only known cases where this could happen are when combining pruning
steps from an IS NOT NULL clause with other steps that matched to the
DEFAULT partition.  This could occur due to RANGE partitioned tables
having two distinct internal representations for marking if the DEFAULT
partition should be scanned.  The IS NOT NULL steps would mark the
"scan_default" boolean, but other steps created for different purposes
could mark a bound_offset Bitmapset, which would ultimately translate into
also scanning the default partition.  This could all fail after multiple
steps were combined with a combine intersect operator, as that will
intersect the bound_offset bits and only set scan_default if all pruning
steps have that flag set.  When both input steps to the intersect operator
had different representations of whether to scan the DEFAULT partition,
the resulting intersect step result would contain neither representation.

Here, we fix this by having the IS NOT NULL pruning result mark the
bound_offsets so that it uses both representations to mark that the
DEFAULT partition must be scanned.
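
A toy model of the failure, under heavy simplification: the struct, the
single bit standing for "an offset that maps to the DEFAULT partition",
and all function names are hypothetical and do not mirror the real
pruning data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified stand-in for the result of one pruning step. */
typedef struct StepResult
{
	uint32_t	bound_offsets;	/* set bits: offsets that must be scanned */
	bool		scan_default;	/* explicit "scan DEFAULT" flag */
} StepResult;

/* In this model, one offset bit ends up meaning "scan DEFAULT". */
#define DEFAULT_OFFSET_BIT (UINT32_C(1) << 3)

/* Intersect: AND the offsets, and keep the flag only if both have it. */
StepResult
combine_intersect(StepResult a, StepResult b)
{
	StepResult	r;

	r.bound_offsets = a.bound_offsets & b.bound_offsets;
	r.scan_default = a.scan_default && b.scan_default;
	return r;
}

bool
scans_default(StepResult r)
{
	return r.scan_default || (r.bound_offsets & DEFAULT_OFFSET_BIT) != 0;
}

/*
 * Combine an IS NOT NULL style result (flag representation) with another
 * step that uses the offset representation.  With fix_applied, the first
 * result carries both representations.
 */
bool
default_survives_intersect(bool fix_applied)
{
	StepResult	notnull = {fix_applied ? DEFAULT_OFFSET_BIT : 0, true};
	StepResult	other = {DEFAULT_OFFSET_BIT, false};

	assert(scans_default(notnull));
	assert(scans_default(other));
	return scans_default(combine_intersect(notnull, other));
}
```

Both inputs agree that DEFAULT must be scanned, yet without the fix the
intersection keeps neither the bit nor the flag.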

Reported-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Diagnosed-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CA+COZaDXrfTaBjLE=Z79MTaH6Xun1V4PeKxLvCNv8mXS8wn0rw@mail.gmail.com
Backpatch-through: 14

Use [re]palloc_array() in buffile.c and fd.c

The code paths patched in this commit fix some of the remnants not
addressed by 1b105f9472bd. There are probably more holes in the tree that
could benefit from stronger type safety guarantees. These hypothetical holes
could be addressed later; this finishes the job in storage/file/ for the
backend code.
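
For readers unfamiliar with the pattern, a stand-alone sketch follows.
The sketch_* macros are hypothetical stand-ins modeled on
palloc_array()/repalloc_array(), built on malloc()/realloc() so that they
compile outside the backend; unlike palloc(), those can return NULL.

```c
#include <assert.h>
#include <stdlib.h>

/* The element type appears once and also determines the result type. */
#define sketch_alloc_array(type, count) \
	((type *) malloc(sizeof(type) * (count)))
#define sketch_realloc_array(ptr, type, count) \
	((type *) realloc((ptr), sizeof(type) * (count)))

/* Allocate "initial" ints, grow to "grown", and fill with 0..grown-1. */
int *
make_counting_array(size_t initial, size_t grown)
{
	int		   *values;
	int		   *bigger;
	size_t		i;

	assert(initial > 0 && grown >= initial);

	values = sketch_alloc_array(int, initial);
	if (values == NULL)
		return NULL;
	for (i = 0; i < initial; i++)
		values[i] = (int) i;

	bigger = sketch_realloc_array(values, int, grown);
	if (bigger == NULL)
	{
		free(values);
		return NULL;
	}
	for (i = initial; i < grown; i++)
		bigger[i] = (int) i;
	return bigger;
}
```

Because the macro casts to the named element type, assigning the result
to a pointer of a different type draws a compiler diagnostic, which a
bare sizeof-based allocation call would not.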

Author: Tristan Partin <tristan@partin.io>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/DKBAWPZ2QDOS.10Y3JDUNKFMXX@partin.io

Fix incorrect Result node flattening logic

This fixes some incorrect flattening of nested Result nodes during
create_plan that was introduced by f2bae51df. That commit failed to
maintain the logic that checks for subplans and gating quals from the
nested Result node before flattening, and that could result in the nested
gating qual and subplan being lost, which could produce incorrect results.

Bug: #19579
Reported-by: Viktor Leis <leis@in.tum.de>
Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/19579-e6296b6c9fc0591c@postgresql.org
Backpatch-through: 19

Fix background psql session cleanup in 051_effective_wal_level.pl.

Commit 6aba42c660c added quit() calls for two background psql sessions
whose slot creation is canceled by pg_cancel_backend(). Both sessions
ran with the default ON_ERROR_STOP=1 and ended their script with \q,
so psql exited as soon as the cancellation error arrived. quit() then
wrote another \q to the already-closed pipe, making the test die with
"ack Broken pipe".

Run both sessions with on_error_stop => 0 and drop the trailing \q, so
that psql stays at the prompt after reporting the error and quit() can
shut it down cleanly.

Discussion: https://postgr.es/m/CAD21AoCZY1fKYgfkvHGWGiXpatUKd23FSLnDCL4m9bWFjdXNZw@mail.gmail.com
Backpatch-through: 19

Fix races between deactivation of logical decoding and slot creation.

On standbys, logical decoding can be deactivated while a logical slot
is being created: either by replaying an
XLOG_LOGICAL_DECODING_STATUS_CHANGE record, or by the end-of-recovery
transition upon promotion, which deactivates logical decoding if no
valid logical slot exists. Both could interleave with a check of the
logical decoding status performed before creating a new slot because
the slot invalidation executed as part of the deactivation cannot find
a slot being created.

For regular slot creation on standbys, EnsureLogicalDecodingEnabled()
assumed that logical decoding must still be enabled during recovery
since the caller had already checked it, tripping an assertion failure
if a concurrent deactivation interleaved.

For slot synchronization, the local slot could be created and
persisted based on the remote slot information fetched before the
deactivation was replayed, leaving a valid slot whose restart_lsn
precedes the deactivation.

Fix both paths by re-checking the logical decoding status after the
new slot has been created: regular slot creation raises an error, and
slot synchronization skips persisting the slot. If the deactivation
happens after the re-check instead, it is guaranteed to invalidate the
newly created slot.

Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAD21AoDEB99VtNbQdDrNd=1gQupJNGMfW_5kdnxq03Q82EK3ag@mail.gmail.com
Backpatch-through: 19

Reject non-finite reltuples when restoring stats

When restoring relation stats, pg_restore_relation_stats() rejected
calls with (reltuples < -1.0). But that is insufficient - Infinity and
NaN values both pass that check, and get stored in pg_class verbatim.
This can have various undesirable consequences.

Fixed by rejecting non-finite reltuple values, in the same non-fatal way
as for the existing checks (emit WARNING and skip the update). Adds a
regression test to stats_import for these non-finite values, and to
check the -1.0 special value is still accepted.
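
A minimal sketch of such a check, assuming a hypothetical helper name
(reltuples_is_valid) rather than the actual code in the stats import
functions:

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Returns true when a restored reltuples value is acceptable. */
bool
reltuples_is_valid(double reltuples)
{
	/*
	 * NaN fails every ordered comparison, and +Infinity is not below -1.0,
	 * so a plain "reltuples < -1.0" test lets both through.
	 */
	if (isnan(reltuples) || isinf(reltuples))
		return false;

	/* -1.0 is the special value and must stay accepted. */
	return reltuples >= -1.0;
}
```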

Backpatch to 18, where pg_restore_relation_stats() was introduced.

Patch by Jan Nidzwetzki, minor commit message tweaks by me.

Author: Jan Nidzwetzki <jan@planetscale.com>
Discussion: https://postgr.es/m/518BA772-8026-412A-AA8F-A7FE4C6B3717@planetscale.com
Backpatch-through: 18

Initialize bs_reltuples in parallel GIN builds

Index builds update pg_class.reltuples for the table. In parallel GIN
builds, workers track the number of processed rows, and report it to
the leader, who then updates the pg_class with a total. However,
gin_parallel_build_main failed to initialize the bs_reltuples field,
leaving it set to whatever happens to be on the stack (which may be
bogus values like Infinity or NaN, or just impossibly high values).

If such values get reported to the leader and stored in pg_class, that
can have serious consequences. The pg_class.reltuples field is used to
decide when a table is due for autovacuum or autoanalyze, and if it
happens to be set to a bogus value, that may never happen. The field is
also used by the optimizer when calculating costs.

Fixed by initializing bs_reltuples together with the rest of the build
state. The bs_numtuples was initialized later, but it seems cleaner to
just initialize all the fields at once.

After a bogus value gets persisted in pg_class, affected systems are
unlikely to self-heal. That would require an ANALYZE, but preventing
that is one of the consequences. We have considered forcing autoanalyze
in these cases, but there's not a good way to reliably identify bogus
values (except for a small minority like Infinity/NaN).

A manual ANALYZE on (possibly) affected tables is the only solution.

Backpatch to 18, where parallel GIN builds were introduced.

Reported-by: Jan Nidzwetzki <jan@planetscale.com>
Discussion: https://postgr.es/m/518BA772-8026-412A-AA8F-A7FE4C6B3717@planetscale.com
Backpatch-through: 18

Make sure to detach injection points for re-attaching

The new test for enabling data checksums with concurrent CREATE
DATABASE calls uses the same injection points as a previous test
but accidentally missed detaching the injection point first.

Fix by detaching the injection point in the PG_TEST_EXTRA SKIP
block so that it can be reused.  Pointed out by buildfarm member
porpoise which failed with:

    die: error running SQL: 'psql:<stdin>:1:
     ERROR: injection point "datachecksumsworker-fake-temptable-wait"
        already defined'

Backpatch to v19 where online checksums were introduced.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Buildfarm member porpoise
Reviewed-by: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com>
Discussion: https://postgr.es/m/28CF6FD9-E1C4-4C04-8270-E3305AC46171@yesql.se
Backpatch-through: 19

Add a couple of commits to .git-blame-ignore-revs

Skip SUBSCRIPTION TABLE TOC entries with --no-subscriptions.

pg_dump in --binary-upgrade mode emits "SUBSCRIPTION TABLE" TOC entries to
preserve pg_subscription_rel state across pg_upgrade. When such a dump
was restored with --no-subscriptions, _tocEntryRequired() skipped the
"SUBSCRIPTION" entry but not the associated "SUBSCRIPTION TABLE" entries,
so the restore would try to apply subscription-relation state for a
subscription that was never created.

Skip "SUBSCRIPTION TABLE" entries as well when no_subscriptions is set.

This can happen when pg_subscription_rel has entries, the dump is taken
with --binary-upgrade, and it is restored with --no-subscriptions.

Reported-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Shlok Kyal <shlok.kyal.oss@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/OS9PR01MB121493DA4C1A7748B11A646D8F5C02@OS9PR01MB12149.jpnprd01.prod.outlook.com

Fix SystemTap dtrace warning about smgr probe argument types.

Commits ca326e903 and 1f8c504e3 widened the byte-count arguments of
smgr__md__read__done and smgr__md__write__done to "long long int".
SystemTap's dtrace(1) fails on that specific spelling and falls back
with a warning (misreported near the previous probe). Use ssize_t
and size_t instead, matching the md.c call sites.

Revise probes.d's note about which types are usable as probe
arguments: recommend using system-supplied type names (macOS dtrace
rejects names like PostgreSQL's uint64), and call out "long long int"
as a known SystemTap failure case rather than a wider failure mode.

Also, update the monitoring.sgml entries for these probes,
which were missed by the prior commits.

Reported-by: Laurenz Albe <laurenz.albe@cybertec.at>
Author: Andrey Rachitskiy <pl0h0yp1@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/697d3c88568442cd637d8453d129f1bb14bbd2a8.camel@cybertec.at

pgindent fix for a84eca6627f

ssl: Use the correct feature macros for TLS protocol support

Our test for whether the underlying TLS library supported a specific
version tested against the TLSX_Y_VERSION set of macros. These
are however always defined, regardless of whether the library was
built with support for the specific protocol version.  Fix
by using the feature test macros OPENSSL_NO_TLSX_Y which are
intended for this use case.

The previous coding held no risk of protocol downgrade against
the underlying library, a library not supporting the protocol
version selected would simply error out as the feature isn't
available.  This can be easily verified using a modern version
of LibreSSL, which in version 3.8 disabled TLS1 and 1.1 by
default.  Once we bump our minimum supported version of LibreSSL
to 3.8+ we can add a test for this.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Yilin Zhang <jiezhilove@126.com>
Discussion: https://postgr.es/m/68B9881D-DAA8-467D-A251-C96E98E57BA0@yesql.se

ssl: Replace deprecated API to get commonName

X509_NAME_get_text_by_NID was deprecated in OpenSSL 4.0.0, and could
be removed in a future version of OpenSSL. The replacement APIs are
available in all versions of OpenSSL and LibreSSL that we support so
we can easily change to make the code future proof.

The reason for the deprecation is that X509_NAME_get_text_by_NID can
only grab the first entry in a list, and doesn't handle multibyte
strings well. The fix is to get the index of the name entry with
X509_NAME_get_index_by_NID and use X509_NAME_get_entry to extract
the data.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Yilin Zhang <jiezhilove@126.com>
Discussion: https://postgr.es/m/68B9881D-DAA8-467D-A251-C96E98E57BA0@yesql.se

ssl: Use TLS_method instead of deprecated SSLv23_method

The SSLv23_method() function has been an alias for TLS_method since
2015 (OpenSSL commit 32ec41539b5b) so we should use the appropriate
name to avoid confusion.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Yilin Zhang <jiezhilove@126.com>
Discussion: https://postgr.es/m/68B9881D-DAA8-467D-A251-C96E98E57BA0@yesql.se

ssl: Remove static var tracking tls_init_hook warnings

If the TLS init hook is defined in conjunction with ssl_sni we
issue a warning to help the user re-configure the cluster. To
avoid drowning the log in warnings, we log only once instead of
once per host. This removes the static variable tracking the
warning to aid future multithreading efforts.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Yilin Zhang <jiezhilove@126.com>
Discussion: https://postgr.es/m/68B9881D-DAA8-467D-A251-C96E98E57BA0@yesql.se

doc: remove added space within synopsis replaceable tags

Restructuring the tags makes the output consistent and doesn't require
added spaces.

Reported-by: Peter Smith
Author: Peter Smith
Discussion: https://postgr.es/m/CAHut+Pu8JahGm76CMdpzH350pHJedA4R2b8JmOim3+m3yxft3Q@mail.gmail.com
Backpatch-through: 19

Fix stale comment in parallel_vacuum_main().

The comment claimed that a parallel vacuum worker has only the
PROC_IN_VACUUM flag because parallel vacuum is not supported for
autovacuum, but commit 1ff3180ca01 allowed autovacuum to use parallel
vacuum workers.

The assertion itself still holds: the leader, whether a backend
running VACUUM or an autovacuum worker, sets PROC_IN_VACUUM before
taking its snapshot, and a parallel worker inherits the flag when
importing the leader's snapshot. The leader's other flags don't reach
the worker, since the snapshot import copies only the PROC_XMIN_FLAGS
bits and PROC_IS_AUTOVACUUM is never set on parallel workers, which
run as regular background workers. Reword the comment to explain that.

Oversight in commit 1ff3180ca01.

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CALj2ACVwQ4WABqq8Lnf+VZEJ45jcTFhyFLFr_ctfS4=QLL-r5w@mail.gmail.com
Backpatch-through: 19

Fix autovacuum-induced flakiness in backwards scan test.

Two of the permutations in backwards-scan-concurrent-splits rely on
their VACUUM step deleting the leaf pages that the waiting backwards
scan will have to recover from.  VACUUM can only do that when it's able
to remove the index tuples whose heap tuples the concurrent session just
deleted.  An autovacuum worker holding a snapshot holds back the
removable cutoff, which leaves the pages non-empty, and so undeleted,
causing the test to fail spuriously.

To fix, wait for the removable cutoff to advance past the deletions
before the scan acquires its snapshot.  This is much like commit
1c64d2fc, which dealt with the same hazard in nbtree_half_dead_pages by
adding the wait_prunable() helper that we reuse here.

Oversight in commit e395fbd3.

Author: Peter Geoghegan <pg@bowt.ie>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b61d9944-d7a3-45f3-b69a-f18c8bfbbbd0@gmail.com

Fix cascading standby reconnect failure after archive fallback

A cascading standby could fail to reconnect to its upstream standby with
"requested starting point ... is ahead of the WAL flush position" after
falling back to archive recovery.  This happened because archive
recovery processes whole segment files, so after replaying a segment the
cascade's next read position lands at the start of the following
segment, which is ahead of the upstream's flush position reported by
GetStandbyFlushRecPtr() (still inside the just-replayed segment).

Fix by having the walreceiver check the upstream's current WAL flush
position via IDENTIFY_SYSTEM before issuing START_REPLICATION.
IDENTIFY_SYSTEM already returns this position (as xlogpos), but
walrcv_identify_system() previously discarded it; now we have a use for
it.  If the requested start point exceeds the upstream's flush position
on the same timeline, the walreceiver waits for
wal_retrieve_retry_interval and retries.

The wait is limited to gaps of at most one WAL segment, which is the
expected case from the segment-granularity of archive recovery.  Larger
gaps indicate the upstream is genuinely behind, so START_REPLICATION is
allowed to proceed (and fail) normally, letting the startup process fall
back to other WAL sources.  The first wait is logged at LOG level;
subsequent waits are demoted to DEBUG1 to avoid log noise.  The
walreceiver honors wal_receiver_timeout during the wait, so it will exit
if the upstream doesn't catch up in time.
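
The decision described in the last two paragraphs can be sketched as a
pure function.  The names (StartDecision, decide_start) and the use of
plain integers for LSNs and timelines are assumptions made for
illustration; the walreceiver code is structured differently.

```c
#include <assert.h>
#include <stdint.h>

typedef enum StartDecision
{
	START_NOW,					/* issue START_REPLICATION right away */
	WAIT_AND_RETRY				/* sleep wal_retrieve_retry_interval first */
} StartDecision;

StartDecision
decide_start(uint64_t start_lsn, uint32_t start_tli,
			 uint64_t upstream_flush_lsn, uint32_t upstream_tli,
			 uint64_t wal_segment_size)
{
	assert(wal_segment_size > 0);

	/* Other timeline, or the upstream already has the requested WAL. */
	if (start_tli != upstream_tli || start_lsn <= upstream_flush_lsn)
		return START_NOW;

	/*
	 * More than one segment ahead: the upstream is genuinely behind, so
	 * let START_REPLICATION proceed and fail normally.
	 */
	if (start_lsn - upstream_flush_lsn > wal_segment_size)
		return START_NOW;

	/* Within one segment: the expected gap after archive recovery. */
	return WAIT_AND_RETRY;
}
```

The timeout handling and the LOG/DEBUG1 distinction are left out; they
apply around repeated WAIT_AND_RETRY results.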

To preserve ABI compatibility on back branches, the flush position from
IDENTIFY_SYSTEM is communicated via a new global variable
(WalRcvIdentifySystemLsn) rather than changing the signature of
walrcv_identify_system().

The bug was introduced in Postgres 9.3 by commit abfd192b1b5b, which
added a flush-position check in StartReplication() that rejects requests
ahead of the upstream server's WAL flush position.

Author: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/CA+nrD2cTuTkkX5WXVZengTYYZbAO6zV8K+Tri-R0fbLFuoyMBA@mail.gmail.com

doc: Add getdatabaseencoding to function docs

The getdatabaseencoding function was added in bf00bbb0c494 in 1998 but
was never documented. While mostly used in tests, there is no reason
not to document it as this function isn't going anywhere and is already
used in extensions.

Author: Ian Barwick <barwick@gmail.com>
Reviewed-by: Thom Brown <thom@linux.com>
Reviewed-by: surya poondla <suryapoondla4@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAB8KJ=ij+pznQGub=DkyJuKL=tC=Q=07qSahTyw7TLb0DdNJsg@mail.gmail.com

Protect PGPROC lookup when terminating background workers

TerminateBackgroundWorkersForDatabase() uses BackendPidGetProc() and,
until now, accessed fields of the returned PGPROC after releasing
ProcArrayLock, including its database OID.  If the PGPROC slot is
recycled during this window, the database OID being checked may belong
to a different backend, causing an unrelated background worker to be
terminated.

Triggering this bug requires a very narrow race: the background worker
identified by BackendPidGetProc() must exit, its PGPROC slot must be
released and reused, and only then must
TerminateBackgroundWorkersForDatabase() examine the database OID.

TerminateBackgroundWorkersForDatabase() holds BackgroundWorkerLock,
preventing parallel workers and dynamically registered workers (such as
those created by worker_spi) from reusing the slot.  As far as I know,
the only plausible scenario is a static background worker that exits and
is restarted quickly enough to reuse the same PGPROC slot within the
race window.  In practice, this race is extremely unlikely, but still
reachable in theory.

Oversight in f1e251be80a0.

Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Aya Iwata <iwata.aya@fujitsu.com>
Reviewed-by: Haibo Yan <tristan.yim@gmail.com>
Discussion: https://postgr.es/m/78E81763-EA1D-4788-9741-4092BCB997A5@gmail.com
Backpatch-through: 19

Avoid accumulating relation locks during sequence synchronization.

While collecting the sequences to synchronize, the sequence sync worker
opened each INIT sequence with RowExclusiveLock and held it until the
transaction committed. With many such sequences, this could exhaust the
shared lock table and fail with "out of shared memory".

The worker only reads each sequence's identity (namespace and name) here
and needs it to stay stable while read, for which AccessShareLock is
enough, as it conflicts with the AccessExclusiveLock taken by DROP,
RENAME, and SET SCHEMA. Take that lock instead and release it as soon as
the identity is read. The later synchronization re-opens each sequence, so
it does not rely on the lock being retained.

Reported-by: Noah Misch <noah@leadboat.com>
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 19, where it was introduced
Discussion: https://postgr.es/m/20260710045217.f0.noahmisch@microsoft.com

Use more strlcpy() in two-phase transaction code

This commit replaces two calls of strcpy() and one call of strncpy() with
strlcpy().  These are patterns that static analyzers (mostly LLMs, it
seems) have been flagging as buffer overflow risks.

The existing calls are safe, here are more details for each one of them:
- MarkAsPreparingGuts()'s strcpy() was guarded by MarkAsPreparing().
- PrepareRedoAdd()'s strcpy() is safe because the record-level CRC check
prevents corrupted data from reaching it unless intentionally
crafted. The replay code also assumes that the GID is within the allowed
bounds, as WAL records are trusted.
- Similarly, ParsePrepareRecord() stores its GID in a buffer bounded by
GIDSIZE while trusting the length provided by the record.

As a result, these changes are purely cosmetic. They adopt a more
defensive coding style and should also silence some of the static
analysis reports received recently.
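
For reference, the semantics relied on here can be sketched with a
minimal strlcpy()-style function.  sketch_strlcpy() is a hypothetical
name used so that the example builds on platforms whose libc has no
strlcpy(); it is not the implementation in src/port.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst, writing at most dstsize bytes including the
 * terminating NUL.  Returns strlen(src); a result >= dstsize means the
 * copy was truncated.
 */
size_t
sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
	size_t		srclen;

	assert(dst != NULL && src != NULL);

	srclen = strlen(src);
	if (dstsize > 0)
	{
		size_t		n = (srclen < dstsize - 1) ? srclen : dstsize - 1;

		memcpy(dst, src, n);
		dst[n] = '\0';			/* always terminated, unlike strncpy() */
	}
	return srclen;
}
```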

Author: Matt Suiche <matt@tolmo.com>
Discussion: https://postgr.es/m/CAGf6Lfx2kbQfcEnCi99V2i65JSWD6ij_E29F+UkY=TyMUyeG6A@mail.gmail.com

Fix planner's nullability/strictness logic for ScalarArrayOpExpr.

find_nonnullable_rels and find_nonnullable_vars mistakenly treated a
ScalarArrayOpExpr that could return FALSE as strict, but that's okay
only at top level of a qual expression; further down, we've got to
insist on a guaranteed-NULL result.  The result was that we could draw
mistaken conclusions about whether outer joins can be simplified, if
the decision hinged on a non-top-level ScalarArrayOpExpr with a
potentially-empty array argument.

I believe this error dates to commit 72a070a36, which taught
find_nonnullable_rels to descend into non-top-level parts of qual
expressions.  is_strict_saop (added earlier by 72153c058) already had
enough intelligence to do the case correctly, but it wasn't passed the
proper flag, ie "top_level" needs to be passed for "falseOK".
e006a24ad copied that mistake into find_nonnullable_vars.

Later, over-eager refactoring in commit 2f153ddfd broke
contain_nonstrict_functions' handling of ScalarArrayOpExpr by treating
it as though it were no different from an OpExpr.  It is, because
we must also prove the array is non-empty before concluding that the
expression is strict.  This could result in misclassifying an
expression as strict when it is not, leading to assorted planning
mistakes such as inlining a SQL function that shouldn't be inlined.
We can almost fix this by just re-adding the previous handling of
ScalarArrayOpExpr in that function, but doing only that would lead to
also calling check_functions_in_node() and thus redundantly checking
the operator's strictness.  Avoid that by turning the if-series into
an else-if chain, as it arguably should have been all along.
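
To see why an empty array defeats strictness, consider a toy model of
"lhs = ANY (array)" with a strict equality operator.  TriBool and
eq_any() are hypothetical names, and NULL array elements are ignored for
brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum TriBool
{
	TV_FALSE,
	TV_TRUE,
	TV_NULL
} TriBool;

/* Toy evaluation of "lhs = ANY (elems)" over non-NULL int elements. */
TriBool
eq_any(bool lhs_isnull, int lhs, const int *elems, size_t nelems)
{
	size_t		i;

	assert(nelems == 0 || elems != NULL);

	/* Empty array: nothing is compared, so the result is FALSE even
	 * when lhs is NULL. */
	if (nelems == 0)
		return TV_FALSE;

	/* Strict operator, NULL input: every comparison yields NULL. */
	if (lhs_isnull)
		return TV_NULL;

	for (i = 0; i < nelems; i++)
	{
		if (elems[i] == lhs)
			return TV_TRUE;
	}
	return TV_FALSE;
}
```

A NULL input gives NULL only when the array is known to be non-empty.
FALSE is as good as NULL at the top level of a qual, but not further
down in an expression.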

The reason these errors have escaped detection for decades is that
they are exposed only in arcane corner cases.  ScalarArrayOpExpr with
an empty array isn't typical usage, and even when that's possible
several other conditions apply before the planner can reach a mistaken
conclusion.  While it's possible to build test cases demonstrating
these mistakes, I (tgl) judged them too indirect and special-purpose
to justify consuming regression test cycles forevermore.

Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAJTYsWV3vqRJmST-gv1NsXEef-zOnjVJpYS910aBaiuMij4nFg@mail.gmail.com
Discussion: https://postgr.es/m/CAJTYsWWcLGmz0f8_QPP_Liq-fc7-geiFSCdqoq3XGeRHPPsWeA@mail.gmail.com
Backpatch-through: 14

Handle invalid and dropped databases during checksum enable

Enable errors out early with a hint when an invalid database exists,
since the worker cannot connect to it and its files stay on disk.

A worker that started but failed gets the same dropped-database
heuristic as one that failed to start, so a concurrent drop during
processing no longer aborts the whole run. The existence check locks
the database first, otherwise a DROP DATABASE ... WITH (FORCE) which
killed the worker is still only halfway done and the database looks
like it is there to stay.

Backpatch to v19 where online checksums were introduced.

Author: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAN4CZFOGdqxtZ5-6gb4apqmvoH=Z+TNH8RKJ3mVtoR1HirKQWg@mail.gmail.com
Backpatch-through: 19

Recheck checksum state before file_copy during CREATE DATABASE

The file_copy strategy check in createdb() runs during option
validation, before the transaction has an XID and before the
pg_database row exists, so the datachecksumsworker launcher
can start in that window and see neither the new database nor
the transaction creating it. It then raw-copies a template
that was not processed yet, and those files stay unchecksummed,
failing verification from then on.

Recheck the state in CreateDatabaseUsingFileCopy(): the XID is
assigned by then, so a launcher starting after this point waits
for the transaction and finds the new database, and the copy
errors out instead. Add an injection point before the catalog
insert to test the window.

Backpatch to v19 where online checksums were introduced.

Author: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAN4CZFPEBsz8JeY4ixQ1V4ZL_xOY6pJaZS8ZLGH7R+wF--pEtg@mail.gmail.com
Backpatch-through: 19

Fix logical decoding of empty prepared transactions.

A two-phase transaction that is assigned an XID but produces no change
to be decoded -- for example, one that only acquires row locks via
SELECT ... FOR SHARE -- has no base snapshot in the reorder
buffer. ReorderBufferReplay() already skips such a transaction at
PREPARE time and never invokes the begin_prepare/change/prepare
callbacks for it, but ReorderBufferFinishPrepared() still called the
commit_prepared (or rollback_prepared) callback. As a result a
spurious COMMIT/ROLLBACK PREPARED was sent to the output plugin with
no preceding PREPARE. For the built-in subscriber this breaks
replication (the apply worker fails to find the prepared transaction),
and test_decoding could even crash.

Fix this by detecting an empty transaction (base_snapshot == NULL) in
ReorderBufferFinishPrepared() and cleaning it up without invoking the
commit/rollback prepared callbacks, mirroring the existing empty
transaction handling in ReorderBufferReplay().
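
The shape of the guard can be sketched as follows.  This is a simplified
stand-in, not the reorder buffer code: ToyTxn, toy_finish_prepared() and
the call counter are hypothetical names, and the real cleanup does far
more than set a flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a reorder buffer transaction. */
typedef struct ToyTxn
{
    const void *base_snapshot;  /* NULL when nothing was decoded */
    bool        cleaned_up;
} ToyTxn;

/* Counts how often the output plugin callback would have fired. */
int toy_commit_prepared_calls = 0;

void
toy_finish_prepared(ToyTxn *txn)
{
    if (txn->base_snapshot == NULL)
    {
        /* Empty transaction: no PREPARE was sent, so send nothing now. */
        txn->cleaned_up = true;
        return;
    }

    toy_commit_prepared_calls++;    /* stands in for the callback */
    txn->cleaned_up = true;
}
```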

On v18 and newer versions, commit 072ee847ad4 changed
ReorderBufferPrepare() to send the prepare whenever it had not already
been sent, which also fires for empty transactions and emits a
spurious PREPARE. On those branches ReorderBufferPrepare() is
therefore additionally guarded with base_snapshot != NULL. This guard,
and the Assert(!rbtxn_sent_prepare()) added in
ReorderBufferFinishPrepared(), are not necessary on v17 and older
versions: there ReorderBufferPrepare() only sends a prepare for
concurrently-aborted transactions (which never applies to an empty
transaction) and the RBTXN_SENT_PREPARE flag does not exist.

Back-patch to v14, where decoding of two-phase transactions was
introduced.

Bug: #19556
Reported-by: Alexander Kozhemyakin <a.kozhemyakin@postgrespro.ru>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/19556-daa6d7ea65054d48@postgresql.org
Backpatch-through: 14

Fix pg_get_publication_tables() failure with concurrent DROP TABLE.

pg_get_publication_tables() collects the OIDs of the published tables
on its first call, without locking them, and then reopens each table
later, once per result row, to compute its column list and fetch its
row filter. The reopen used table_open(), which errors out with "could
not open relation with OID" if the table has been dropped in the
meantime. This could happen for any published table without an
explicit column list, which is every table in FOR ALL TABLES and FOR
TABLES IN SCHEMA publications, but also FOR TABLE entries without a
column list. The failure is common in environments where many tables
are created and dropped while publication tables are being queried,
e.g. by table synchronization on a subscriber.

Fix by opening every table with try_table_open(), which returns NULL
if the relation no longer exists, and skipping the table in that
case. Concurrently dropped tables are thus simply absent from the
result set, which is the expected point-in-time behavior.
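
A minimal sketch of the skip-on-missing pattern, assuming a toy catalog
that is just an array of live OIDs.  ToyCatalog, toy_try_open() and
toy_publication_rows() are invented for illustration and ignore locking
entirely.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int ToyOid;

/* Hypothetical catalog: OIDs of relations that still exist. */
typedef struct { const ToyOid *live; size_t nlive; } ToyCatalog;

/* Like a "try" open: reports false instead of raising an error. */
bool
toy_try_open(const ToyCatalog *cat, ToyOid relid)
{
    for (size_t i = 0; i < cat->nlive; i++)
        if (cat->live[i] == relid)
            return true;
    return false;
}

/* Copies to "out" only the collected OIDs that can still be opened. */
size_t
toy_publication_rows(const ToyCatalog *cat, const ToyOid *collected,
                     size_t ncollected, ToyOid *out)
{
    size_t nout = 0;

    for (size_t i = 0; i < ncollected; i++)
    {
        if (!toy_try_open(cat, collected[i]))
            continue;           /* dropped concurrently: skip the row */
        out[nout++] = collected[i];
    }
    return nout;
}
```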

As a side effect, tables with an explicit column list, which were
previously returned without being opened, are now also locked with
AccessShareLock, so the function can block behind concurrent DDL on
such tables where it previously did not.

Backpatch to v16, where we added the table_open() call in
pg_get_publication_tables().

Author: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Ajin Cherian <itsajin@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/CALj2ACVYYooWH-5tJ6cPKkU%2BmutVxwb_z4S%2BqAi-zdrFqxXE2Q%40mail.gmail.com
Backpatch-through: 16

Restore vacuum_delay_point() in GIN posting-tree leaf vacuum

Commit fd83c83d094 turned the recursive posting-tree cleanup in
ginVacuumPostingTreeLeaves() into an iterative sweep that follows the
tree's leaf pages via their rightlinks.  The recursive version called
vacuum_delay_point() while processing the tree, but that call was removed
and never re-added to the new loop.  As that commit only set out to fix a
deadlock, the removal appears to have been unintentional.

Consequently the leaf-page sweep of a single posting tree runs with no
vacuum_delay_point(), and therefore no CHECK_FOR_INTERRUPTS().  A posting
tree stores all the TIDs for one indexed key, so for a frequently
occurring key it can span a large number of leaf pages.  While such a
tree is being vacuumed the operation ignores vacuum_cost_delay and does
not respond to query cancellation or statement_timeout; an autovacuum
worker likewise cannot be interrupted mid-sweep when another backend
requests a conflicting lock.

Restore the call, placed after the current page has been unlocked and
released so that no buffer content lock is held across a potential delay
(cf. 21c27af65fb).  The sibling loops in ginbulkdelete() and
ginvacuumcleanup() already call vacuum_delay_point() once per page.
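
The placement rule (delay only once the page lock is gone) can be
modeled with a toy loop.  Nothing here is GIN code; ToySweep and both
functions are hypothetical, and the "lock" is a plain flag.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    bool locked;            /* is a page lock currently held? */
    int  delay_points;      /* delay calls made with no lock held */
    int  bad_delay_points;  /* delay calls made while a lock was held */
} ToySweep;

/* Stand-in for a delay point; records whether a lock was held. */
void
toy_delay_point(ToySweep *s)
{
    if (s->locked)
        s->bad_delay_points++;
    else
        s->delay_points++;
}

/* Walks the leaf pages left to right, one delay point per page. */
void
toy_sweep_leaves(ToySweep *s, size_t nleaves)
{
    for (size_t i = 0; i < nleaves; i++)
    {
        s->locked = true;       /* lock and clean the current leaf */
        s->locked = false;      /* unlock and release it */
        toy_delay_point(s);     /* safe: nothing is locked here */
    }
}
```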

Author: Paul Kim <mok03127@gmail.com>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: solai v <solai.cdac@gmail.com>
Discussion: https://postgr.es/m/178447127453.110.12276981925360691905%40mail.gmail.com
Backpatch-through: 14

Avoid RETURNING side effects for FOR PORTION OF leftovers.

UPDATE/DELETE ... FOR PORTION OF inserts leftover rows for the
untouched parts of the original row. These hidden inserts should not
affect the command tag or ROW_COUNT, so they call ExecInsert() with
canSetTag set to false.

However, ExecInsert() still processed the RETURNING list whenever the
target ResultRelInfo had ri_projectReturning set. That caused
RETURNING expressions to be evaluated for leftover rows even though
their results were discarded. As a result, expressions with side
effects and information-leaking functions could be executed on the
leftover rows, in addition to the visibly updated or deleted row.

Fix by having ExecInsert() skip RETURNING processing when it is
handling an internal FOR PORTION OF leftover insert. Use both the
presence of a FOR PORTION OF clause and mtstate->operation ==
CMD_INSERT for this check, so that the auxiliary INSERT of a
cross-partition UPDATE with a FOR PORTION OF clause still processes
RETURNING normally.

Back-patch to v19, where support for FOR PORTION OF was added.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://postgr.es/m/07C125E5-F6ED-460C-A394-E6503DAE18FB@gmail.com
Backpatch-through: 19

Fix portability issue in authentication test 003_peer

The mapped user name is built upon the OS user name of the environment
where the test is run. Depending on the characters used in the OS user
name, CREATE ROLE may not get parsed (the author has mentioned hyphens
as one case), causing a failure of the test.

Let's use double-quotes around the mapped user name, which should be a
solution good enough for the environments where this test tends to run.
The buildfarm issued no complaint over the years.

Oversight in 3c4e26a62c31, so backpatch down to v19. Perhaps
3c4e26a62c31 and this commit should be backpatched further down, but
let's leave that for another day, if it proves necessary.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/20260727133857.fbd23d43d422f10f376a8bee@sraoss.co.jp
Backpatch-through: 19

Fix propagation of indimmediate flag in index_create_copy()

index_create_copy is used to create copy definitions of existing indexes.
Currently, it passes 0 as constr_flags to index_create(), which results
in the copied index always being created as immediate (indimmediate set
to true).  For deferrable unique constraints, it means that the
transient index used during phase 2 of REINDEX CONCURRENTLY forces
immediate constraint checks on concurrent inserts, which can cause
unexpected constraint violations based on the definition of the parent
table, inconsistently set in the copied index.

To fix this without violating the contract of constr_flags (which should
only be used when creating constraints) and without relaxing the strict
assertion in index_create(), this introduces a new index creation flag:
INDEX_CREATE_DEFERRABLE.  If set, a copied index's indimmediate is set
to false, meaning that unique constraints are not enforced immediately
on insertion, but at transaction commit time.

An isolation test for REINDEX CONCURRENTLY is added, based on an
injection point waiting after phase 1 of the operation, where an index
copy has been built and is able to accept DMLs for its validation in
phase 2.  The test is tentatively backpatched down to v17.
INJECTION_POINT() is outside a transaction context, which should be fine
on HEAD since 8daeaa9b642c but I suspect may cause issues in v19 and
older branches due to the wait facility depending on condition variables
and a DSM setup, but let's see what the buildfarm tells us.

Author: Nitin Motiani <nitinmotiani@google.com>
Discussion: https://postgr.es/m/CAH5HC97JmjPpgiQOqW9xm8qXhNiu7zZ1Qh+FfhEESJuDv69kuQ@mail.gmail.com
Backpatch-through: 14

Further improve the names generated for indexes on expressions.

Commit 181b6185c failed to do anything useful with a whole-row Var,
deeming it "fishy". But it is legal to put such a Var into an
expression index column, so let's expand it as the name of the table.

Another problem reachable via that one is that we could generate an
empty index column name, which isn't really legal although by chance
nothing complained about it. It's not clear whether any other such
cases remain, but as cheap insurance let's use "expr" if the tree walk
fails to generate any text.

Reported-by: Chauhan Dhruv <chauhandhruv351@gmail.com>
Author: Chauhan Dhruv <chauhandhruv351@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CANWwWcp_DCJjq8pomeqp6W=fbygvzXXQO028VDJ9_6sLPjQnVA@mail.gmail.com

Fix race condition when enabling logical decoding concurrently.

With wal_level = 'replica', logical decoding is enabled on demand
when the first logical replication slot is created.

When enabling logical decoding, EnableLogicalDecoding() flips the
shared logical_decoding_enabled flag and writes an
XLOG_LOGICAL_DECODING_STATUS_CHANGE record so that standbys follow the
status change. The initial "already enabled?" check and the WAL record
write happen under two separate acquisitions of
LogicalDecodingControlLock, since the lock must be released while
waiting for the ProcSignalBarrier: processes absorbing the barrier
acquire the same lock in shared mode.

Consequently, if two backends concurrently created the first logical
slots, both could pass the initial check and both write a
status-change record. The redundant record lands after the decoding
start point already reserved by the other backend's slot, so decoding
that slot processes the record and fails with "unexpected logical
decoding status change", as xlog_decode() assumes that no such record
can appear within the WAL range any slot decodes.

Fix by re-checking the status after re-acquiring the lock, so that
only the backend that actually performs the disabled->enabled
transition writes the WAL record.
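
The re-check can be sketched as two steps.  This is a single-threaded
model with the lock acquisitions elided, and ToyShared and both function
names are made up for the example; it shows only why the second check
keeps the record count at one when two backends both pass the first.

```c
#include <stdbool.h>

typedef struct
{
    bool enabled;          /* shared "logical decoding enabled" flag */
    int  records_written;  /* status-change WAL records emitted */
} ToyShared;

/* Step 1: initial check, done under the first lock acquisition. */
bool
toy_needs_enable(const ToyShared *s)
{
    return !s->enabled;
}

/* Step 2: runs after the lock was released and re-acquired. */
void
toy_finish_enable(ToyShared *s)
{
    if (s->enabled)
        return;             /* another backend won the race */
    s->enabled = true;
    s->records_written++;   /* only the transitioning backend logs */
}
```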

Reported-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Author: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAFC+b6oYzmAgp7F0ivrhfZT46-CjvCTrU9pWuMNcem-52YjOTw@mail.gmail.com
Backpatch-through: 19

Deparse FOR PORTION OF using the range column's current name.

Commit 8e72d914c recorded the range column's name in ForPortionOfExpr
and used that for deparsing FOR PORTION OF. This gives the wrong
answer if the ForPortionOfExpr is saved in a rule or SQL function and
then the column gets renamed. Drop the ForPortionOfExpr.range_name
field; instead fetch the current column name from the catalogs when
needed.

Also drop ForPortionOfState.fp_rangeName, which wasn't being used
anywhere.

Full disclosure: an earlier draft of this patch was made with
Claude Opus 4.8.

Reported-by: John Naylor <johncnaylorls@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CANWCAZYFEpJ5Oi45gi4q9Y6LYa4_oiAXxuNNWe-1ym-i0fF8Pw@mail.gmail.com
Backpatch-through: 19

pg_resetwal: do not allow zero next multixact offset

Offset 0 is the "invalid" marker in pg_multixact/offsets since offsets
went 64-bit and the allocator stopped skipping it. pg_resetwal could
still produce it via -O 0 or guessed control values, breaking the first
multixact created after the reset ("MultiXact n has invalid offset",
and vacuum of the affected table fails from then on). Reject -O 0 like
-m and -o already do, and guess 1 like initdb does.
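
A hedged sketch of such an argument check, using only the C standard
library.  toy_parse_next_mxoff() is a hypothetical helper, not the
pg_resetwal code, and its handling of malformed input is an assumption
of this example.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parses a next-multixact-offset argument.  Returns false for malformed
 * input and for 0, which is the "invalid" marker.  Illustrative only.
 */
bool
toy_parse_next_mxoff(const char *arg, uint64_t *result)
{
    char       *end;
    unsigned long long val;

    errno = 0;
    val = strtoull(arg, &end, 0);
    if (end == arg || *end != '\0' || errno != 0)
        return false;       /* not a complete, in-range number */
    if (val == 0)
        return false;       /* reject -O 0 */
    *result = (uint64_t) val;
    return true;
}
```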

Author: Zsolt Parragi <zsolt.parragi@percona.com>
Discussion: https://www.postgresql.org/message-id/CAN4CZFNoO6MUkg526TmA=mC_RjY2gp4VKCnvK6y12v3ppOkhJA@mail.gmail.com
Backpatch-through: 19

Fix issues in logical replication sequence synchronization.

1. Stop a running sequence synchronization worker when
ALTER SUBSCRIPTION ... DISABLE is executed. The worker did not reread its
subscription after starting a transaction, so it kept running with a stale
copy and missed the disable. It now calls maybe_reread_subscription()
after StartTransactionCommand(), matching the apply worker.

2. Restore the invariant that publisher-side synchronization slots are
dropped last during ALTER SUBSCRIPTION ... REFRESH PUBLICATION. The
slot-drop loop now runs after the sequence-removal loop, so the
non-transactional slot drops happen only after all catalog changes that
could still be rolled back on error.

3. Restore psql tab completion for
ALTER SUBSCRIPTION ... REFRESH PUBLICATION WITH (.

4. Make pg_stat_subscription report NULL for the fields that do not apply
to a sequence synchronization worker, which does not stream from a
walsender, and update the documentation accordingly.

5. Update the pg_subscription_rel.srsublsn catalog documentation to
describe its semantics for sequence rows.

Reported-by: Noah Misch <noah@leadboat.com>
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 19, where it was introduced
Discussion: https://postgr.es/m/20260710045217.f0.noahmisch@microsoft.com

Fix deparsing of JSON_ARRAY(subquery) with a FORMAT clause

Commit 8d829f5a0 introduced the JSCTOR_JSON_ARRAY_QUERY constructor
type so that ruleutils.c could deparse JSON_ARRAY(subquery) using its
original syntax, storing the transformed subquery in a new orig_query
field. However, the input FORMAT clause of JSON_ARRAY(subquery FORMAT
...) was not preserved for deparsing. The format was recorded only in
the executable expression kept in the func field, which ruleutils.c
does not inspect, so it is silently dropped.

This is more than cosmetic, because FORMAT JSON changes the result:
without it a text value is treated as a string to be quoted, while
with it the value is treated as already-formatted JSON.

To fix, record the input FORMAT in a new deparse-only field of
JsonConstructorExpr, alongside orig_query, and emit it in ruleutils.c.

Bump catalog version.

Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Ewan Young <kdbase.hack@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/4C89B193-7D54-4705-9CF9-F0D484B9E099@gmail.com
Backpatch-through: 19

Update .gitignore in test/modules/nbtree

Noticed while doing some routine work.

Oversight in e395fbd32a07.

Use direct hash lookup in logicalrep_partmap_invalidate_cb()

This replaces an O(N) hash_seq_search() loop by an O(1) lookup, removing
a TODO item, making the invalidation callback faster when dealing with
many relations. This can work because LogicalRepPartMap is keyed by a
partition OID, and a relmapentry's localreloid matches with it.

An assertion is added in logicalrep_partition_open() to enforce the fact
that localreloid matches with the hash key.

Author: DaeMyung Kang <charsyam@gmail.com>
Discussion: https://postgr.es/m/20260417174450.4158878-1-charsyam@gmail.com

Add _bt_set_startikey row compare test coverage.

Add pg_regress tests that exercise the row compare logic that commit
7d9cd2df added to _bt_set_startikey. Also add tests that exercise the
_bt_set_startikey SAOP array path.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-Wz=KjQsD2W2a=b51uH905=0mF6Le4evhWkN2FL1+uRPhUg@mail.gmail.com
Backpatch-through: 19

Add test coverage for nbtree backwards scans.

Backwards scans have unique concurrency rules: rather than unreservedly
trusting a saved left link, the scan optimistically rechecks its
pointed-to leaf page's right link (i.e. whether it still points back to
the page that _bt_readpage just read).  Usually, the left sibling of the
just-read page won't have changed, in which case the scan can proceed
with reading the left sibling as planned.  But it's possible that the
key space that the scan needs to read next is no longer covered by the
original left sibling page due to concurrent page splits and/or page
deletions.  When that happens, the scan must recover by relocating the
new/current left sibling of the just-read page.

Test coverage for backwards scans was limited to the happy path.  Add an
isolation test (and associated injection points) that test the recovery
path.  This covers several distinct recovery scenarios (concurrent page
splits, concurrent page deletions, and minor variants thereof).

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CAH2-WzmD+jUBOpFS2jrnqqrdPSAjoxqyL9FPKaE1BtnY=8Nntg@mail.gmail.com

Add tests for nbtree empty index predicate locking.

Add coverage for predicate locking of completely empty nbtree indexes,
where we must predicate lock the entire relation (instead of some
individual leaf page). Both paths that can find the index empty (and
must consider whether it's still empty after PredicateLockRelation
returns) are covered by a new isolation test that uses injection points.

Catalog relation scans skip the injection points. The waiting session
runs catalog queries of its own after arming the (session-local) points,
and could otherwise suspend itself with nothing lined up to wake it.

Follow-up to bugfix commits ce3f19e2 (the _bt_endpoint fix) and f9b7fc65
(the _bt_first/_bt_search fix).

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkNoTn3yXY0iGkSuavJ+sL8EROf+kitW+_2v2tJVWuKmA@mail.gmail.com

Add missing PGDLLIMPORT marker

Oversight in commit fb23cc7e81db.

Reported-by: Anton Voloshin <a.voloshin@postgrespro.ru>
Discussion: https://postgr.es/m/ad5d772e-09d9-4248-97a4-0011afab9e71@postgrespro.ru

Fix another empty nbtree index SSI race.

Commit f9b7fc65 fixed a race when predicate-locking completely empty
btrees: without a buffer lock held, a matching key could be inserted
between _bt_search and the PredicateLockRelation call, so the scan would
miss concurrently inserted tuples while the writer wouldn't see the
reader's predicate lock. That commit only fixed _bt_first's _bt_search
path, though. Scans without useful insertion scan keys return early
from _bt_first via _bt_endpoint, which still didn't recheck if the
relation was empty.

To fix, add handling to _bt_endpoint that is analogous to the handling
added to _bt_search by commit f9b7fc65.

Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WzkNoTn3yXY0iGkSuavJ+sL8EROf+kitW+_2v2tJVWuKmA@mail.gmail.com
Backpatch-through: 14

psql: Allow pg_read_all_stats to see database size in \l+

pg_database_size() allows access to users who have either CONNECT
privilege on the target database or privileges of the pg_read_all_stats
role. However, previously, psql's \l+ checked only for CONNECT,
so users with privileges of pg_read_all_stats still saw "No Access" for
databases they could not connect to.

Fix this by making \l+ also check
pg_has_role('pg_read_all_stats', 'USAGE'), matching
pg_database_size()'s permission rules.

For back branches, emit the pg_read_all_stats check only when
connected to PostgreSQL 10 or later, since earlier releases do not have
that predefined role.

Backpatch to all supported versions.

Author: Christoph Berg <myon@debian.org>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/amCo6qRmnfPVk4-V@msg.df7cb.de
Backpatch-through: 14

Avoid reporting permission-denied publisher sequences as missing

Previously, if a sequence synchronization batch contained both a sequence
that had been dropped on the publisher and another for which the
replication role lacked SELECT privilege, the latter was reported
twice: once as a permission failure and again as missing on the
publisher.

This happened because the permission-denied sequence was not marked as
found on the publisher. As a result, when another sequence in the batch
was genuinely missing, the later missing-sequence check incorrectly
classified the permission-denied sequence as missing as well.

Fix this by marking the permission-denied sequence as found before
reporting the permission failure, so it is not later reported as
missing.
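
The ordering can be shown with a toy two-pass model.  ToySeq and
toy_sync_batch() are hypothetical; the real worker reports errors rather
than setting flags.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct
{
    bool exists_on_publisher;
    bool readable;            /* replication role has SELECT */
    bool found;               /* set while processing publisher rows */
    bool reported_denied;
    bool reported_missing;
} ToySeq;

void
toy_sync_batch(ToySeq *seqs, size_t n)
{
    /* Pass 1: rows returned by the publisher. */
    for (size_t i = 0; i < n; i++)
    {
        if (!seqs[i].exists_on_publisher)
            continue;
        seqs[i].found = true;   /* mark first, then classify */
        if (!seqs[i].readable)
            seqs[i].reported_denied = true;
    }

    /* Pass 2: anything never marked found is genuinely missing. */
    for (size_t i = 0; i < n; i++)
        if (!seqs[i].found)
            seqs[i].reported_missing = true;
}
```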

Reported-by: Noah Misch <noah@leadboat.com>
Author: Vignesh C <vignesh21@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CALDaNm3LsUjW7PahuCsbYAxajSF+S328tw5E9rF0erdh7dKOXw@mail.gmail.com
Backpatch-through: 19

doc: Add missing CREATE/ALTER PUBLICATION parameter descriptions

Document table_name, column_name, and schema_name in the CREATE
PUBLICATION and ALTER PUBLICATION reference pages. Also add anchors for
the ALTER PUBLICATION parameter list, matching the style already used by
CREATE PUBLICATION.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHut+Ptekz+TO4ui8-fiBm4Y+O2v=HQnkK_cW4G=w9ep8654EA@mail.gmail.com

Fix EXCEPT publication test to check subscriber

Commit fd366065e06 added tests intended to verify that rows inserted
on the publisher are replicated to the subscriber when using multiple
publications, with one excluding the target table via EXCEPT and
another including it.

However, the tests queried the publisher instead of the subscriber.
Since the rows were inserted directly into the publisher, the checks
would always succeed, providing no coverage of replication.

Fix this by querying the subscriber so the tests verify the replicated
state.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwGfXUO7f4t6KNGurYwg6QsnLtpP0K3EACbAwYWtxGfKfQ@mail.gmail.com
Backpatch-through: 19

Validate subscription conninfo on owner change

For subscriptions using SERVER, changing the owner can change the
effective connection string. However, ALTER SUBSCRIPTION ... OWNER TO
did not validate the generated conninfo for the new owner.

As a result, ownership could be transferred to a non-superuser whose
generated connection string did not satisfy password_required=true.
The ownership change succeeded, but the subscription would fail later
when the worker or another command tried to connect.

Fix this by making ALTER SUBSCRIPTION ... OWNER TO validate the new
owner's generated conninfo with walrcv_check_conninfo().

Backpatch to v19, where SERVER subscriptions were introduced.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yuanchao Zhang <145zhangyc@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/CAHGQGwFGa6+wWVgUmZPFwN=fBY59mYPkMK3=TxT=Pv5C1mNNRQ@mail.gmail.com
Backpatch-through: 19

doc: Improve pg_stat_recovery documentation

Improve the documentation for pg_stat_recovery in several ways:

- Mention the view in high-availability.sgml as a way to monitor
  recovery state and replay progress, alongside the existing recovery
  information functions.
- Clarify that the view returns at most one row, not exactly one row,
  and no rows to users who lack the pg_read_all_stats privilege.
- Correct the description of last_replayed_end_lsn to clarify that it
  is the end LSN of the last replayed record plus one.
- Document that replay_end_tli equals last_replayed_tli when no WAL
  record is currently being replayed.
- Clarify that current_chunk_start_time is NULL until streaming WAL
  has been received.

Backpatch to v19, where pg_stat_recovery was introduced.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAHGQGwGRavm18HqnQn_f68QB96qk6arhjET1V93OJH09Mgojkg@mail.gmail.com
Backpatch-through: 19

Fix socket_putmessage_noblock() to call socket_putmessage()

socket_putmessage_noblock() used pq_putmessage(), which redirects to
PqCommMethods->putmessage.  In the common cases, this points to
socket_putmessage(), but it would become incorrect if PqCommMethods
points to a different implementation.

This change may look like a bug fix, but as far as I can see this is mostly
cosmetic.  The code is able to work currently, as the repalloc() done in
the noblock() call ensures that the blocking path of internal_putbytes()
is never reached.  The issue has gone unnoticed since 2bd9e412f92b.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAO6_Xqpf5+Rzw_-XOOz-d-R5x6_2JHtpnzXP0nrYWiHyZokA_Q@mail.gmail.com

doc: Improve description of pg_stat_activity.backend_type

The documentation of pg_stat_activity used an incomplete list of values
for backend_type. While at it, switch to an itemized list, now ordered
alphabetically, with a short description of each item.

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-By: Michael Paquier <michael@paquier.xyz>
Reviewed-By: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/5e94c0196084f648ae6a00107125494f5804318a.camel@cybertec.at

injection_points: Clear waiter slot on error and exit

injection_wait() only clears its slot in the waiter array after the
wait loop finishes.  When the waiting query is canceled or the backend
is terminated (the wait loop has a CHECK_FOR_INTERRUPTS), the slot leaks.
Later wakeups of the same point then bump the counter of the leaked slot
instead of the real waiter, which sleeps forever.  Repeated leaks can
exhaust all the slots.

The waiting loop is now wrapped with PG_ENSURE_ERROR_CLEANUP, so that
the injection point slots, which are shared resources, are cleaned up
on ERROR as well as on FATAL.

An isolation test is added: cancel one waiter, terminate another waiter,
then check that a later waiter still receives a wakeup.  Without the
fixed code, the test would fail on timeout.

Author: Zsolt Parragi <zsolt.parragi@percona.com>
Discussion: https://postgr.es/m/CAN4CZFO+KF=cc0-iEg28RhqRBp_fTs6D4b8b7D7DB-pGYP3Ccg@mail.gmail.com
Backpatch-through: 17

Reject sequence synchronization against pre-PostgreSQL 19 publishers.

Sequence synchronization requires the page_lsn field returned by
pg_get_sequence_data(), which was added in PostgreSQL 19. Previously,
requesting sequence synchronization against an older publisher (via
ALTER SUBSCRIPTION ... REFRESH SEQUENCES or by running
ALTER SUBSCRIPTION ... CONNECTION on a disabled subscription with
sequences in the INIT state and subsequently enabling the subscription)
would cause the sequence synchronization worker to repeatedly fail with a
confusing "invalid query response" error.

Check the publisher's server version up front in both
AlterSubscription_refresh_seq() and copy_sequences(), and error out
immediately when it predates PostgreSQL 19.
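
The up-front check amounts to a comparison against the numeric server
version, where PostgreSQL encodes 19.0 as 190000.  The macro and
function below are hypothetical names for this sketch, not the
identifiers used in the patch.

```c
#include <stdbool.h>

/* PostgreSQL encodes version 19.0 as 190000 in server_version_num. */
#define TOY_MIN_SEQSYNC_VERSION 190000

/* True if a publisher with this version can serve sequence sync. */
bool
toy_publisher_supports_seqsync(int server_version_num)
{
    return server_version_num >= TOY_MIN_SEQSYNC_VERSION;
}
```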

Also document the PostgreSQL 19 publisher requirement for sequence
replication in the logical replication documentation and in
ALTER SUBSCRIPTION ... REFRESH SEQUENCES.

Reported-by: Noah Misch <noah@leadboat.com>
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 19
Discussion: https://postgr.es/m/20260710045217.f0.noahmisch@microsoft.com

walsummarizer: Guard against WAL files whose tail ends are not valid.

SummarizeWAL documents that maximum_lsn should be passed as "the switch
point when reading a historic timeline, or the most-recently-measured end of
WAL when reading the current timeline." But the caller always passed the
most recently measured end-of-WAL even when reading from a historic
timeline, due to an oversight on my part. Fix that.

As far as I can determine, for this to become an issue in practice, it's
necessary to have a corrupted WAL file in the archive. SummarizeWAL checks
that every record it processes both starts and ends before switch_lsn; so if
all the WAL files in the archive are valid, SummarizeWAL will still discover
where it should stop summarizing and do the right thing. However, if
there's a corrupted file in the WAL archive, and if it is also the case that
the end of the current timeline has advanced past the switch point, then the
incorrect maximum_lsn value can result in trying to read an invalid record
and erroring out, which leads to repeatedly retrying and failing with an
error every time.

One way this could occur is if a new primary is promoted and creates a
.partial file, and the user manually renames that file to remove the suffix,
and it is then archived. In that situation, the tail end of the file need
not be valid WAL, and that could lead to a stuck WAL summarizer.

Reported-by: Fabrice Chapuis <fabrice636861@gmail.com>
Analyzed-by: Thom Brown <thom@linux.com> (using claude)
Discussion: http://postgr.es/m/CAA5-nLDdvGMkN6Z-GaHGHG5T7QWEgv4YoHO7XvOJbeD00cghNg@mail.gmail.com
Backpatch-through: 17

Improve lookup_type_cache() handling on out-of-memory errors

If an error happens during the initialization of the TYPEOID catcache,
as part of lookup_type_cache(), the error handling of that lookup would
cause an assertion failure via finalize_in_progress_typentries(), called
during error recovery, the presence of an in-progress type OID causing
a catcache initialization outside of a transaction context.

The addition to the in-progress list is now delayed until after the
initial entry lookup. Alexander Lakhin has found a fancy way to reproduce the
problem, with the injection of probabilistic memory allocation failures.

This problem is unlikely to show up in practice. Like the other
changes of this kind, no backpatch is done.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/95c64dc2-3abe-4f4e-b285-4c681f565d9f@gmail.com

Add new pgstats routine to split pending data setup

This commit adds pgstat_prep_pending_from_entry_ref(), a new pgstats
routine that is able to prepare an existing PgStat_EntryRef to receive
pending stats.  This split gives a way for callers to obtain first a
reference via pgstat_get_entry_ref(), then set up pending data as two
separate, distinct steps.

Previously, the only way to get an entry reference with pending data
ready was pgstat_prep_pending_entry(), which bundles lookup, creation,
and pending setup in a single call.  Callers that need finer control
over the entry creation had no way to attach pending data to an
already-obtained entry reference.  One case where this has been shown to
matter for a stats kind is where one wants to check some capacity (for
example where a GUC bounds the maximum number of entries allowed) before
deciding if a new entry should be created.  So this split can help in
reducing calls to pgstat_get_entry_ref(), meaning fewer shmem hash table
lookups.

The only logical ordering change is that pgStatPendingContext is
initialized after calling pgstat_get_entry_ref() in
pgstat_prep_pending_entry().  This does not matter in practice.

pgstat_prep_pending_entry() is refactored to use the new function
internally.  All the existing callers are unchanged.  Existing
out-of-core custom stats kinds should see no impact.

Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAA5RZ0sV+TsLejUMhAM=PJoOm8u-t8ru7B67KvyLCy=19sM87g@mail.gmail.com

doc: clarify how TIMESTAMP WITH TIME ZONE behaves

Mention "time zone conversion" as a way to clarify the time zone is not
stored in the database.

Reported-by: Richard Neill
Discussion: https://postgr.es/m/ddf41f033a8add84e1f28a095defafae@richardneill.org

Backpatch-through: 19

doc: clarify to_char("OF") HH/MM doesn't represent actual chars

Change formatting and chars to be less of a match against actual
formatting characters.

Reported-by: Phil
Discussion: https://postgr.es/m/177801333530.795.16999885814007014333@wrigleys.postgresql.org

Backpatch-through: 19

Fix typo

from commit c1fe2d1a383

Remove assertion added by commit 7dcea51c2a4d

We've got no reports of problems. Get rid of it.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Backpatch-through: 19
Discussion: https://postgr.es/m/alewd1f2G0kKeM1i@alvherre.pgsql

Unify error messages

pg_upgrade: Message wording fix

For internally consistent terminology

Message style fixes

Change DETAIL messages to conform to the style guide by capitalizing
the first word of sentences and ending sentences with a period.

Author: Peter Smith <peter.b.smith@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Xiaopeng Wang <wxp_728@163.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPszSntkUgN%2BQa9matGY6MLEoFGSuVbuKDgnnTdZ7YPRwg%40mail.gmail.com

Improve generate_partition_qual()'s cache handling on out-of-memory errors

An in-flight failure when trying to set rd_partcheckcxt or
rd_partcheck, while for example doing an allocation in copyObject(),
would leave a backend cache in a corrupted state. The operations are
now ordered so that we avoid a leak in the cache memory context and a
semi-filled cache state when an allocation failure happens.

This is unlikely to be hit in practice. Like the other
improvements of this kind, no backpatch is done.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/95c64dc2-3abe-4f4e-b285-4c681f565d9f@gmail.com

Test what BEFORE UPDATE triggers do to FOR PORTION OF

If a BEFORE trigger changes NEW.valid_at, what is the interaction with
FOR PORTION OF?  This commit gives a test to capture our current
behavior: The trigger's change replaces the value we computed
automatically, but it does not change the bounds of the temporal
leftovers.

This matches the behavior of MariaDB.  On the other hand, DB2 rejects
changing the start/end columns of a PERIOD.  Since we don't have
PERIODs, we can't reject the change at trigger definition time as DB2
does, but we could reject it at run time by comparing the values
before and after running triggers.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/CA%2BrenyV3Cr9BvWsPeb1t8b%3DPk24apuzyGbubAEs_YsgLUTfXpg%40mail.gmail.com

Allow logical replication workers to ignore default_transaction_read_only.

Sequence synchronization updates sequence state via setval(), which
explicitly calls PreventCommandIfReadOnly(). If
default_transaction_read_only is enabled on the subscriber, this causes
sequencesync workers to fail with "cannot execute setval() in a read-only
transaction". Apply and tablesync workers are not affected, since they
write via direct heap access rather than through these read-only-checked
functions.

Rather than special-casing sequencesync, override
default_transaction_read_only to "off" for all logical replication workers
in InitializeLogRepWorker(), the same way session_replication_role and
search_path are already forced there. This keeps the initialization
uniform.

For PG-19, we kept the fix narrow by overriding
default_transaction_read_only to "off" only for sequencesync workers.

Reported-by: Noah Misch <noah@leadboat.com>
Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 19
Discussion: https://postgr.es/m/20260710045217.f0.noahmisch@microsoft.com

Add logical decoding status to pg_control_checkpoint().

Commit 8108765f04b added the logical decoding status to the
pg_controldata output, but overlooked the pg_control_checkpoint() SQL
function, which reports the same checkpoint information. This commit
adds a logical_decoding column to pg_control_checkpoint(), placed
after full_page_writes to match the pg_controldata output order.

Oversight in 8108765f04b.

Bump catalog version.

Reported-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwEkp1-1n5iC38+yHSNh955+KshwtCL6DzA0vk_vuUF_Eg@mail.gmail.com
Backpatch-through: 19

Fix recovery target test waiting on unavailable WAL

Buildfarm member skink reported a failure in
recovery/003_recovery_targets after commit d5751c33cc3. The newly
added recovery_target_xid set-then-cleared test could time out while
waiting for pg_last_wal_replay_lsn() to reach the expected LSN.

The test recorded lsn6 after calling pg_switch_wal(). As a result,
lsn6 pointed into the next WAL segment, but pg_switch_wal() only
archived the previous one. Since the standby in this test restores WAL
from the archive only, it could not obtain the segment containing
lsn6 and waited indefinitely.

Fix this by recording lsn6 before calling pg_switch_wal(), so the
archived WAL contains the LSN that the standby is waiting for.

Per buildfarm member skink.

Reported-by: Álvaro Herrera <alvherre@kurilemu.de>
Author: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/al5Y2mWffRs1NP34@alvherre.pgsql

doc: Granting TRIGGER or REFERENCES on table is dangerous.

It's always been the case that granting these privileges to users that
you don't fully trust was a bad idea, but it hasn't always been
obvious to people reading the documentation that this is the case.
To prevent confusion, and also repeated reports to pgsql-security,
mention it explicitly.

Discussion: http://postgr.es/m/CA+TgmobrjCHBuWHrvX3=2vndUCO2thUOdevrCcMDFW86cqCYvw@mail.gmail.com
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Backpatch-through: 14

Fix restore of partitions with exclusion constraints

Commit 8c852ba9a4 allowed exclusion constraints to be added to
partitioned tables, but wasn't careful to verify that pg_restore worked
correctly for them. Fix that by making CompareIndexInfo() more
selective about what needs to be rejected.

Author: Japin Li <japinli@hotmail.com>
Reported-by: Keith Paskett <keith.paskett@logansw.com>
Discussion: https://postgr.es/m/2A40921D-83AB-411E-ADA6-7E509A46F1E4@logansw.com

Avoid ERROR in recovery target GUC assign hooks

Recovery target parameters are postmaster-startup GUCs, but their
assign hooks previously did more than assign individual parameter
values. They also updated the global recoveryTarget state and raised
ERROR if more than one recovery target appeared to be set.

This was not a good fit for GUC assign hooks. Assign hooks should not
throw ERROR, and deriving cross-parameter state while individual GUCs
are still being assigned makes the result depend on assignment order
rather than the final configuration.

For example, setting one recovery target and then setting another
recovery_target_* parameter to an empty string could clear
recoveryTarget, causing recovery to proceed with no target even
though a valid target remained configured.

Fix this by having the assign hooks only store their own parameter
values. The effective recoveryTarget is now derived once from the
final recovery_target* settings in
validateRecoveryParameters(), which also rejects configurations that
specify more than one recovery target with FATAL. This preserves the
expected behavior for repeated assignments of the same GUC, treats empty
values as "not set", and removes cross-GUC validation from the assign
hooks.
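
A minimal sketch of deriving the target once from the final settings,
under simplifying assumptions: each recovery_target* value is modeled as a
plain string, and derive_recovery_target(), TARGET_NONE and
TARGET_CONFLICT are hypothetical names, not the identifiers used in
validateRecoveryParameters().

```c
#include <stddef.h>

#define TARGET_NONE		(-1)	/* hypothetical: no target configured */
#define TARGET_CONFLICT	(-2)	/* hypothetical: more than one target set */

/*
 * Look at the final values only.  Empty or NULL means "not set".
 * Returns the index of the single configured target, TARGET_NONE if
 * nothing is set, or TARGET_CONFLICT if several are set (the case the
 * commit message says is rejected with FATAL).
 */
static int
derive_recovery_target(const char *const settings[], size_t nsettings)
{
	int			found = TARGET_NONE;

	for (size_t i = 0; i < nsettings; i++)
	{
		if (settings[i] == NULL || settings[i][0] == '\0')
			continue;			/* treated as "not set" */
		if (found != TARGET_NONE)
			return TARGET_CONFLICT;
		found = (int) i;
	}

	return found;
}
```

Because the function sees only the final values, the result does not
depend on the order in which the individual settings were assigned.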

Author: JoongHyuk Shin <sjh910805@gmail.com>
Reviewed-by: Greg Lamberson <greg@lamco.io>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Scott Ray <scott@scottray.io>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Henson Choi <assam258@gmail.com>
Reviewed-by: Zsolt Parragi <zsolt.parragi@percona.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CACSdjfPUa4UvKjADgOERXoxNYmCg2mqqiqKkiJk6mX6E4qgVFw@mail.gmail.com

Allow PostgreSQL::Test::Cluster::start() to pass postmaster options

Previously, tests that needed extra postmaster command-line options had
to invoke pg_ctl start directly, because
PostgreSQL::Test::Cluster::start() provided no way to pass them. That
bypassed the test framework's postmaster PID tracking, so a postmaster
could be left running if the test failed after startup.

Add an options parameter to
PostgreSQL::Test::Cluster::start(), which is passed to pg_ctl's
--options argument. This allows tests to use start() while
preserving the framework's normal cleanup behavior.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: JoongHyuk Shin <sjh910805@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwEpfE0CDUUODjBt7GO9U4ZF11hqga_Ci3wP8=O49oFKVw@mail.gmail.com

Fix LSN format in REPACK worker debug message

Commit 6f6f284c7ee4 introduced use of LSN_FORMAT_ARGS across the whole
tree to remove use of manual bit-shifting, and commit 2633dae2e487
changed the printf format to be %X/%08X; however commit 28d534e2ae0a
violated both conventions by reintroducing the old manual-shift style
with the deprecated %X/%X format in one debug message. Make that new
message conform to our style.
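
For illustration, the preferred style can be sketched standalone as
follows. The macro here is a local stand-in modeled on LSN_FORMAT_ARGS
rather than the definition from the PostgreSQL headers, and format_lsn()
is a hypothetical helper.

```c
#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogRecPtr;	/* local stand-in for PostgreSQL's LSN type */

/* Local stand-in: expands to the high 32 bits, then the low 32 bits */
#define LSN_FORMAT_ARGS(lsn) ((uint32_t) ((lsn) >> 32)), ((uint32_t) (lsn))

/* Hypothetical helper: render an LSN using the %X/%08X convention */
static void
format_lsn(char *buf, size_t buflen, XLogRecPtr lsn)
{
	snprintf(buf, buflen, "%X/%08X", LSN_FORMAT_ARGS(lsn));
}
```

The zero-padded low word keeps LSNs the same width within a segment
range, and the macro avoids repeating the shifts at every call site.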

Author: kenny <kennychen851228@gmail.com>
Backpatch-through: 19
Discussion: https://postgr.es/m/CAPXstDuWD8jg0=C8PXTXGSTTsZcjqJ+u+xKCrMpN99CXsxQzCg@mail.gmail.com

Move code to get_tables_to_repack_partitioned

Some of its code was pointlessly in its caller. This makes it better
contained and clearer.

Backpatch to 19, to avoid having two different copies in case we have to
modify it again later.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: ChangAo Chen <cca5507@qq.com>
Discussion: https://postgr.es/m/alD9l-XlCuu3eUEe@alvherre.pgsql

Improve pgstat_get_entry_ref_cached() behavior on out-of-memory errors

A failure in allocating a new cache entry in the backend-level hash
table holding references to shared stats entries would leave the table
in an inconsistent state, leading to a crash or FATAL at session exit,
depending on whether there are pending stats.

Rather than leaving things in an inconsistent state on OOM, the code is
switched to use MemoryContextAllocExtended(MCXT_ALLOC_NO_OOM), so that an
allocation failure leads to a cleanup of the hash table before issuing
the allocation error.

The problem is unlikely to show up in practice, so no backpatch is
done. Note that shared memory is not impacted, only a backend-level
hash table.

Reported-by: Alex Masterov <amasterov@gmail.com>
Discussion: https://postgr.es/m/CA+8z=zumV9sscgK=j1Es+-564maVoO9CMDdB9CsW9=FCziCj3w@mail.gmail.com

Fix RLS checks for FOR PORTION OF leftover rows

UPDATE/DELETE FOR PORTION OF may insert leftover rows to preserve the
parts of the old row that are outside the target range.  Those inserts
go through ExecInsert(), which checks RLS policies using
WCO_RLS_INSERT_CHECK.

However, the rewriter only added RLS WITH CHECK options for the
original statement command.  For UPDATE, that meant only
WCO_RLS_UPDATE_CHECK options were available, so ExecInsert() skipped
them.  For DELETE, no RLS WITH CHECK options were added at all.  As a
result, leftover rows could be inserted even when they violated INSERT
RLS policies.

Fix this by adding INSERT RLS WITH CHECK options for UPDATE/DELETE FOR
PORTION OF target relations.  Also add regression coverage for both
UPDATE and DELETE, including cases where allowed leftovers still
succeed and disallowed leftovers are rejected.

Author: Chao Li <lic@highgo.com>
Co-authored-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/6C34A987-AC50-4477-BD71-2D4AFEE1A589%40gmail.com
Discussion: https://www.postgresql.org/message-id/flat/CAJTYsWWdeBkoH5g8D-k9LDw9ciqsMxb21EJSiFXAzP4J%3DXyxOQ%40mail.gmail.com

Handle concurrent sequence refreshes.

'ALTER SUBSCRIPTION ... REFRESH SEQUENCES' can race with a running
sequence synchronization worker. If the worker has fetched a sequence's
value from the publisher but not yet marked it READY, a concurrent refresh
that resets the sequence to INIT can be overwritten by the worker's stale
value, silently losing the refresh request.

Handle this by stopping any running sequence sync worker before resetting
the sequences to INIT. This is race-free because AlterSubscription()
already holds AccessExclusiveLock on the subscription object. That lock
blocks a running worker's UpdateSubscriptionRelState(), which takes
AccessShareLock on the object, and also any worker the apply worker
re-launches, because a new worker takes AccessShareLock on the object in
InitializeLogRepWorker() before it reads pg_subscription_rel. Such a
worker cannot act on the sequence states until the refresh commits, by
which time they are reset to INIT and it will synchronize the latest
publisher values.

Reported-by: Noah Misch <noah@leadboat.com>
Author: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Backpatch-through: 19
Discussion: https://postgr.es/m/20260710045217.f0.noahmisch@microsoft.com

Skip unnecessary get_relids_in_jointree() when there are no PHVs

Commit 1df9e8d96 made remove_useless_result_rtes() compute the set of
baserels in the jointree, to pass down to the find_dependent_phvs()
checks. But those checks are no-ops when the query contains no PHVs,
since find_dependent_phvs() and find_dependent_phvs_in_jointree() both
return early in that case. So we can avoid the
get_relids_in_jointree() scan altogether when root->glob->lastPHId is
zero, leaving baserels as NULL.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAMbWs49H275KzgZr3Cd1Hy+6Lmwp35bZ+5PrVc62k3HDLj6hNQ@mail.gmail.com
Backpatch-through: 16

Run nbtree test module tests under autoconf builds

Commit 1e4e5783e added the src/test/modules/nbtree test module, but only
registered it in the meson build, not in the module list in
src/test/modules/Makefile. As a result, autoconf builds never ran the
module's tests.

To fix, add the module to the Makefile's lists of
injection-point-dependent modules.

Oversight in commit 1e4e5783e.

Author: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/CAH2-Wz=JchiD5ksiT35p8Ar02gaNv8_y6w2wBAST+Zzen-eNjw@mail.gmail.com
Backpatch-through: 19

Fix parsing of underscores in pg_plan_advice occurrence numbers

The pg_plan_advice scanner recognizes underscores as digit separators
just like the core parser, but used strtoint() to convert occurrence
numbers which does not support underscores.  Consequently, advice such
as SEQ_SCAN(x#1_0) failed to parse.  Fix by using pg_strtoint32_safe()
like the core scanner, and also add regression test coverage.
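
To show why a plain integer conversion rejects such input, here is a
simplified sketch of an underscore-aware parser. It is not
pg_strtoint32_safe(): parse_int32_with_underscores() is a hypothetical
function that handles only non-negative decimal input and assumes that an
underscore is valid only between two digits.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: parse a non-negative decimal integer in which
 * single underscores may separate digits ("1_0" means 10).  Returns
 * false on a syntax error or if the value does not fit in int32.
 */
static bool
parse_int32_with_underscores(const char *s, int32_t *result)
{
	int64_t		acc = 0;
	bool		prev_digit = false;

	if (*s == '\0')
		return false;

	for (; *s != '\0'; s++)
	{
		if (isdigit((unsigned char) *s))
		{
			acc = acc * 10 + (*s - '0');
			if (acc > INT32_MAX)
				return false;	/* out of range */
			prev_digit = true;
		}
		else if (*s == '_' && prev_digit && isdigit((unsigned char) s[1]))
			prev_digit = false; /* separator between two digits */
		else
			return false;		/* stray character or misplaced '_' */
	}

	*result = (int32_t) acc;
	return true;
}
```

A converter without the separator branch stops at the first underscore,
which is the failure described above for SEQ_SCAN(x#1_0).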

This bug was independently found and reported by Lukas Fittl and Chao
Li.  Backpatch down to v19 where pg_plan_advice was introduced.

Author: Chao Li <lic@highgo.com>
Co-authored-by: Daniel Gustafsson <daniel@yesql.se>
Reported-by: Lukas Fittl <lukas@fittl.com>
Reported-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Discussion: https://postgr.es/m/22E2ECE0-B768-43D5-8575-61C3EBC2E4E8@gmail.com
Discussion: https://postgr.es/m/CAP53PkzKeD=t90OfeMsniYrcRe2THQbUx3g6wV17Y=ZtiwmWTQ@mail.gmail.com
Backpatch-through: 19

Remove redundant null-treatment check in window function dedup.

Commit 25a30bbd423 (IGNORE NULLS / RESPECT NULLS for window functions)
made ExecInitWindowAgg() treat two otherwise-equal window functions as
duplicates only when their ignore_nulls settings also matched:

    if (i <= wfuncno && wfunc->ignore_nulls == perfunc[i].ignore_nulls)

That extra term reads WindowStatePerFuncData.ignore_nulls, but the field
was never populated when a per-function entry was filled in, so it stayed
zero from palloc0_array().  Consequently a duplicate call carrying
IGNORE NULLS or an explicit RESPECT NULLS never matched an identical
earlier entry and was needlessly given its own per-function slot and
evaluated twice.  (Results stayed correct; this was a missed sharing, not
a wrong answer.)

The extra term is in fact redundant.  WindowFunc.ignore_nulls is a plain
scalar field with no pg_node_attr, so _equalWindowFunc() already compares
it; the preceding equal() call therefore never matches two WindowFuncs
that differ only in null treatment.  If equal() matches, ignore_nulls
necessarily matched too, so the term can never change the outcome, and
WindowStatePerFuncData.ignore_nulls existed only to feed it.

Rather than populate the shadow field, drop the redundant term and the
field (and adjust the now-stale comment) and let equal() do the work.
That fixes the same bug while removing the hand-maintained duplicate
state that caused it, so it cannot silently drift again.

Author: Chao Li <li.evan.chao@gmail.com>
Co-authored-by: Ewan Young <kdbase.hack@gmail.com>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Discussion: https://postgr.es/m/5D2C9081-5DFE-4E27-AB14-7358238EA1BC%40gmail.com
Backpatch-through: 19

Further cleanup for commit 54cd6fc83.

Commit 54cd6fc83 set the version argument for the stats-import functions
introduced by that commit, which is of type int, using UInt32GetDatum,
not Int32GetDatum. This would be completely harmless as it's positive
and currently ignored in the functions, but let's fix that code to use
Int32GetDatum for consistency.

Author: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/CAPmGK14aremJGrPezVwFqWt7dnrMhD3KF1DgzsRygAUPETBU7w%40mail.gmail.com
Backpatch-through: 19

Fix edge case in remove_useless_result_rtes() with outer joins.

find_dependent_phvs() and find_dependent_phvs_in_jointree() decide
whether a PlaceHolderVar depends on the RTE_RESULT rel we're
considering removing by comparing the PHV's phrels to a singleton set
containing that rel's RT index, reasoning that if phrels contains any
other relid bits then those define an appropriate place where we can
evaluate the PHV.  But since this code was originally written, we've
redefined phrels to include outer-join relids, and that breaks this
logic, potentially allowing us to remove an RTE_RESULT that leaves no
valid place to evaluate the PHV.  The planner doesn't throw an error
when that happens, but it does produce an incorrect plan that will not
replace the PHV's value with NULL when needed.

In the known test case for this bug, the "extra" OJ relid is one that
we've actually decided to remove but haven't yet cleaned out of the
query's PHVs.  It's not entirely clear though that that would always
be the case.  Let's restore this code to the way it was designed to
work, by considering only base relids within the PHV's phrels.
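
The idea of considering only base relids can be sketched with a toy
bitmask in place of the planner's Bitmapset. Everything here is an
assumption made for illustration: RelidSet, RELID_BIT() and
phv_depends_on_rel() are hypothetical, and the real code works on
Bitmapsets.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t RelidSet;		/* toy bitmap: bit N stands for RT index N */

#define RELID_BIT(rti) (UINT64_C(1) << (rti))

/*
 * Hypothetical check: does the PHV depend on exactly the rel "rti"?
 * Outer-join relids are masked out of phrels first, so an extra OJ bit
 * no longer makes the PHV look as if it had another place to be
 * evaluated.
 */
static bool
phv_depends_on_rel(RelidSet phrels, RelidSet baserels, int rti)
{
	return (phrels & baserels) == RELID_BIT(rti);
}
```

Comparing the unmasked phrels against the singleton set would report no
dependency whenever an outer-join relid is present, which is the case
described above.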

Bug: #19553
Reported-by: Viktor Leis <leis@in.tum.de>
Author: Matheus Alcantara <matheusssilv97@gmail.com>
Co-authored-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19553-4561747f93f368a7@postgresql.org
Backpatch-through: 16

Restore the ability to use | and -> as prefix operators.

Commit 2f094e7ac changed the parser to treat these as built-in
operator names, where before they were just generic Op. While
it correctly gave them the same precedence as Op and added new
productions to allow them to still be used as infix operators,
it missed allowing them to still be used as prefix operators.
At least one extension expects to be able to do that, so add
the necessary productions.

Bug: #19558
Reported-by: Pierre Senellart <pierre@senellart.com>
Author: Pierre Forstmann <pierre.forstmann@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19558-ad1fca59a3a471a0@postgresql.org
Backpatch-through: 19

Fix REASSIGN OWNED for subscriptions in other databases.

Subscription objects are conceptually database-local objects, but
pg_subscription is a shared catalog so that the launcher process can
scan it.

Check readers of pg_subscription to ensure that, unless it's the
launcher process, it filters by MyDatabaseId. Most readers were
already doing so, but this commit fixes REASSIGN OWNED and adds guards
to catch other problems in the future. Also, clarify documentation.

Author: Dilip Kumar <dilipbalaut@gmail.com>
Reported-by: Noah Misch <noah@leadboat.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Discussion: https://postgr.es/m/20260710192533.4f.noahmisch@microsoft.com
Backpatch-through: 19

Revert "Reject concurrent sequence refreshes".

This reverts commit f38afa4abb04e85530c94b88daf11c089375daca.

That commit fixed a race that could leave stale sequence values on the
subscriber after 'ALTER SUBSCRIPTION ... REFRESH SEQUENCES'. It did so by
raising an ERROR during 'ALTER SUBSCRIPTION ... REFRESH SEQUENCES'
whenever a sequence synchronization worker was already running for the
subscription.

That approach caused intermittent buildfarm failures, because the existing
tests did not ensure the sequencesync worker had stopped before executing
'ALTER SUBSCRIPTION ... REFRESH SEQUENCES'. While discussing how to fix
the tests, we concluded that blocking the command while a sequencesync
worker is running is inconvenient for users. So we will fix the original
race differently in a follow-up commit.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 19
Discussion: https://postgr.es/m/3614163.1784163070@sss.pgh.pa.us
Discussion: https://postgr.es/m/20260710045217.f0.noahmisch@microsoft.com

Generate unicode_limits.h.

To ensure we do not overflow a size_t while case mapping on a 32-bit
platform, we need to know the maximum amount a UTF8 string can expand.

Calculate that maximum while generating unicode tables as a part of
the update-unicode target, and output into a new header
unicode_limits.h. Minor refactoring along the way.

Add a StaticAssertDecl to check that a MaxAllocSize text value
expanded by that amount would still have a length that fits in
size_t. (We couldn't actually create a new text value out of that, but
we still need to avoid overflow.)
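
A compile-time check of this kind could look roughly like the sketch
below. The expansion factor is a placeholder, since the real value is
computed by the generator and is not stated here; MAX_ALLOC_SIZE mirrors
PostgreSQL's MaxAllocSize (0x3fffffff), and the other names are
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors PostgreSQL's MaxAllocSize, defined locally for this sketch */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/* Placeholder only: the real factor comes from the generated header */
#define HYPOTHETICAL_MAX_EXPANSION 3

/* Fail the build if the worst-case expanded length cannot fit in size_t */
_Static_assert(MAX_ALLOC_SIZE <= SIZE_MAX / HYPOTHETICAL_MAX_EXPANSION,
			   "worst-case case-mapped length must fit in size_t");

/* Hypothetical helper: upper bound on the length after case mapping */
static size_t
max_case_mapped_length(size_t srclen)
{
	return srclen * HYPOTHETICAL_MAX_EXPANSION;
}
```

Dividing SIZE_MAX by the factor, rather than multiplying the allocation
limit, keeps the check itself free of overflow on 32-bit platforms.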

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3213927.1783950167@sss.pgh.pa.us

Fix yet another portability problem in new NLS test.

Álvaro reported offlist that his machine was passing the new-in-v19
nls.sql test in "make check" but not in "make installcheck". On
investigation, the cause turned out to be that he has LANGUAGE set in
his environment, and with (at least recent versions of) glibc that
overrides LC_MESSAGES and friends, as per previous research by Bryan
Green. "make check" works because pg_regress unsets LANGUAGE before
starting the postmaster, but in installcheck mode we're exposed to
the prevailing value and we lose.

We're already hacking the value of LANGUAGE in this test for Solaris,
so let's just extend that to unsetting LANGUAGE on every other platform.

Reported-by: Álvaro Herrera <alvherre@kurilemu.de>
Diagnosed-by: Andrew Dunstan <andrew@dunslane.net>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/a337896e-5bff-490b-afc9-c545f06c014c@gmail.com
Backpatch-through: 19

Turn visibilitymap_clear() Assert back into an error

Commit ed62d26caca fixed a bug in clearing the visibility map and, while
doing so, made some incidental changes to visibilitymap_clear(). One of
them replaced the error thrown when the wrong buffer is passed to
visibilitymap_clear() with an Assert().

While anyone adding a new visibilitymap_clear() caller should be running
assert-enabled builds, visibilitymap_set() still reports the same
wrong-buffer condition with an elog(ERROR), so visibilitymap_clear()
should also do so for consistency. This change was also unrelated to
the bug fix and is better made as a separate commit. Restore the error.

Reported-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwF_PzOv5y7ucFh7Fqqqa8ar83zYwvWugqarRD6%2B7GCtEQ%40mail.gmail.com
Backpatch-through: 19