git.ipfire.org Git - thirdparty/postgresql.git/log

Fix null-pointer crash in ECPG compiler.

When compiling a DECLARE section containing a union nested
inside a struct, ecpg passes a null value for struct_sizeof to
ECPGmake_struct_type. I (tgl) didn't foresee that case in
commit 0e6060790, and wrote an unprotected mm_strdup() call.

Reported-by: iMSA (via Jehan-Guillaume de Rorthais <jgdr@dalibo.com>)
Author: Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20260625114849.34b2148e@karst
Backpatch-through: 18

Message and comment wording fixes

Some parts of pg_upgrade referred to "on the old cluster" etc. Change
that to "in the old cluster", matching existing style.

doc: Some spell checking

Fix options listing of pg_test_timing --cutoff

The new pg_test_timing --cutoff option (commit 0b096e379e6) appeared
out of order in the documentation and the code. Fix that.

pg_stat_io: Don't flag extends by autovacuum launcher

pg_stat_io asserts on unexpected combinations of backend type and IOOp.
These combinations were meant to help detect bugs given our current
understanding of the system -- not serve as a set of rules for what is
allowed. The autovacuum launcher scans catalog tables and may on-access
prune them. This previously wouldn't have led to any extends of the
relation, but now that on-access pruning may pin a page of the
visibility map (4f7ecca84ddacbce27), scanning tables may lead to
extending the visibility map. This would cause the launcher to trip an
assert. Since there is no reason to forbid the launcher from doing
extends, remove it from the list of backend type pgstat_tracks_io_op
flags for doing IOOP_EXTEND.

Read-only catalog scans still don't let pruning set the VM; doing so
needs table AM API changes and is left for the future.

Reported-by: Ewan Young <kdbase.hack@gmail.com>
Discussion: https://postgr.es/m/CAON2xHNOyaN9MCZohhD_NL6as3QVhGA0SOn2Hyi9w6+Y-_1bFA@mail.gmail.com

psql: Add tab completion for subscription wal_receiver_timeout

Commit fb80f388f added wal_receiver_timeout as a CREATE/ALTER
SUBSCRIPTION option, but psql tab completion did not include it in the
subscription option lists.

Add wal_receiver_timeout to completion for CREATE SUBSCRIPTION ... WITH
and ALTER SUBSCRIPTION ... SET.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/BBC5628A-63C0-4436-B8F3-90AF59BBEB73@gmail.com

Remove extraneous newlines from guc_parameters.dat

In commit fce7c73fb, two unnecessary newlines were kept: before
archive_command and seq_page_cost. Remove them here just to be
tidier.

Author: Anton Voloshin <a.voloshin@postgrespro.ru>
Discussion: https://www.postgresql.org/message-id/270ae9e7-85c6-487d-b02b-a994af56710b%40postgrespro.ru

Distinguish datacheckums worker invocations more reliably

In some corner cases, a new datachecksums worker could be launched
while an old one was still running.  If you're really unlucky, the old
worker could set the worker_result in shared memory and mislead the
launcher to think that a newer worker invocation completed
successfully, even though it failed for some reason.  That's highly
unlikely to happen in practice as it requires several race conditions
with workers and launchers starting, failing and succeeding and at the
right moments.  Nevertheless, better to tighten it up.

To distinguish different worker invocations, assign a unique
'worker_invocation' number every time a new worker is launched.  In
the worker, check that the invocation number matches before setting
the worker result.  This ensures that the result always belongs to the
latest invocation.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/b283fbb9-298e-4953-9120-eefaf24fae20@iki.fi

Minor cleanup around checking datachecksum worker result

Rename the 'success' field in DataChecksumState to 'worker_result'.
That's more appropriate when it's not a simple boolean.

Don't access the field after releasing the lock in ProcessDatabase().
No other process should be modifying it, but if we bother to do any
locking in the first place, let's do it right.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/b283fbb9-298e-4953-9120-eefaf24fae20@iki.fi

Avoid leaving DataChecksumState->worker_pid to an old value

It might be left to an old value if the launcher was terminated while
a worker was running.  launcher_exit() sends SIGTERM to the worker,
but did not clear 'worker_pid'.  Clear it, to be tidy.

Also clear it in ProcessDatabase() before starting a new datachecksums
worker, to be sure we start from a clean slate.  The codepath where
WaitForBackgroundWorkerStartup() returns BGWH_STOPPED but
worker_result != DATACHECKSUMSWORKER_SUCCESSFUL didn't clear it, while
all other codepaths did clear or set it.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/b283fbb9-298e-4953-9120-eefaf24fae20@iki.fi

Misc cleanup in datachecksums_state.[ch]

Move DataChecksumsWorkerResult struct to the .c file.  It's not used
anywhere else since commit 07009121c2 removed the injection point test
code that the comment referred to.

Mark StartDataChecksumsWorkerLauncher() as static, since it's not
called from outside the .c file.  The DataChecksumsWorkerOperation
struct can then be moved into the .c file too.

Clarify the comment on StartDataChecksumsWorkerLauncher().  It said
"Main entry point for datachecksumsworker launcher process", but I
found that misleading.  That description would be a better fit for
DataChecksumsWorkerLauncherMain(), which is the process's "main"
function, rather than StartDataChecksumsWorkerLauncher().

Fix comment on WaitForAllTransactionsToFinish() on postmaster death.
The comment claimed that it sets "the abort flag" on postmaster death,
but it actually just errors outs.  Improve the comment to explain why
it doesn't just use WL_EXIT_ON_PM_DEATH.

Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/b283fbb9-298e-4953-9120-eefaf24fae20@iki.fi

Fix set of typos and grammar mistakes

This is similar to d3bba0415435, batching all the reports of this type
received since the last batch. This covers typos and inconsistencies
for the most part.

The user-visible documentation change impacts only HEAD.

Refine error reporting for null treatment on non-window functions

Commit 4e5920e6de8 disallowed RESPECT NULLS/IGNORE NULLS on
non-window functions, but it also caused the parser to check for
that clause too early in some cases. As a result, calls such as a
nonexistent function with IGNORE NULLS no longer reported the more
helpful "function ... does not exist" error, and aggregate functions
used as window functions reported "only window functions accept ..."
instead of the more accurate aggregate-specific error.

This commit moves the RESPECT NULLS/IGNORE NULLS checks so that
helpful existing errors are preserved where appropriate. This restores
"function ... does not exist" for nonexistent functions, while still
reporting that plain functions are not window functions and that
aggregates do not accept null treatment.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CAHGQGwH7VY_0GkhycyYZ4czkPGL0uGzDyOxk3uuFOSRR7wFY3g@mail.gmail.com

plperl: Fix NULL pointer dereference for forged array object

In get_perl_array_ref(), for a PostgreSQL::InServer::ARRAY object, we
look up its "array" key with hv_fetch_string() and then inspect the
returned SV.  However, hv_fetch_string() returns a NULL pointer when
the key is absent, and the code dereferenced that result without first
checking whether the pointer itself was NULL.  As a result, a plperl
function returning a forged PostgreSQL::InServer::ARRAY object that
lacks the "array" key would crash the backend with a segmentation
fault.

Fix this by checking the pointer returned by hv_fetch_string() before
dereferencing it, matching how other callers in this file already
guard the result.  With the check in place, such an object falls
through to the existing error report instead of crashing.

Author: Xing Guo <higuoxing@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CACpMh+DYgcnqZwQLXXuxQcehJTd7T8UmKWSLsK4mFBEp9G2ajA@mail.gmail.com
Backpatch-through: 14

Re-index ModifyTable FDW arrays when pruning result relations

ExecInitModifyTable() rebuilds the per-result-relation lists after
dropping result relations removed by initial runtime pruning. The
re-indexing was done for withCheckOptionLists, returningLists,
updateColnosLists, mergeActionLists and mergeJoinConditions, but
fdwPrivLists and fdwDirectModifyPlans were missed. As a result, a
kept foreign result relation could be handed the wrong fdw_private,
or ri_usesFdwDirectModify could be set from the wrong plan index,
leading to wrong behavior or a crash in BeginForeignModify() and in
the direct-modify path.

show_modifytable_info() had the same problem: it indexed the
plan-ordered node->fdwPrivLists with the post-pruning executor
position, so once initial pruning removed a result relation it
could read a different relation's fdw_private (often a NIL entry),
producing wrong EXPLAIN output or a crash.

Fix by re-indexing fdwPrivLists and fdwDirectModifyPlans alongside
the other lists, saving the re-indexed private lists in
ModifyTableState.mt_fdwPrivLists and reading from there in both
nodeModifyTable.c and explain.c.

Reported-by: Chi Zhang <798604270@qq.com>
Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Author: Rafia Sabih <rafia.pghackers@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/19484-a3cb82c8cde3c8fa%40postgresql.org
Backpatch-through: 18

Nail pg_parameter_acl in relcache.

Previously, a parameter specified in the startup packet for a physical
replication connection could encounter an error trying to perform an
ACL check for the setting.

Problem was introduced in a0ffa885e4, but no reasonable back-patchable
solution was found, so fixing only in master.

Bumps catversion.

Discussion: https://postgr.es/m/d8f8e11f06d692fff89e6be0f22732d30cf695a0.camel%40j-davis.com
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Mark Dilger <mark.dilger@enterprisedb.com>

Fix incorrect declarations of variadic pg_get_*_ddl() functions.

The final parameter of an ordinary variadic function should be an
array type.  CREATE FUNCTION won't accept a declaration that isn't
like that, but it's possible to put an incorrect combination into a
pg_proc.dat entry.  Sadly, the opr_sanity test that was supposed to
check that is broken and does not report functions with non-array
final parameters.  This allowed exactly such a thinko to sneak into
the recently-added pg_get_*_ddl() functions: their last argument
should be declared text[] but was declared text.  (We'd probably
have noticed eventually, when somebody tried to actually pass a
variadic array to one of those functions.  But their regression
tests do not do that.)

Fix those functions, and fix the opr_sanity test so we'll notice
next time.  Bump catversion for new pg_proc contents.

Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/D41A334E-ED9E-42EE-830D-28D4D36E9317@gmail.com

psql: Tighten heuristics for BEGIN/END within CREATE SCHEMA.

Since d51697484, psql's scanner treats CREATE SCHEMA as a command that
may contain SQL-standard routine bodies, so that semicolons inside
BEGIN ATOMIC ... END blocks do not terminate the command too early.
However, the code counted BEGIN/END throughout CREATE SCHEMA, so that
it could be fooled by valid (and previously accepted) code such as

    CREATE SCHEMA s CREATE VIEW begin AS SELECT 1;

Improve this by explicitly checking whether each CREATE sub-clause is
CREATE [OR REPLACE] {FUNCTION|PROCEDURE}, and only counting BEGIN/END
within those clauses.  Since CREATE FUNCTION/PROCEDURE wasn't allowed
in CREATE SCHEMA before d51697484, this will not risk failure on any
cases that worked before v19.

There remain cases that fool the top-level CREATE FUNCTION/PROCEDURE
heuristic and thus also the CREATE SCHEMA case, for example

    CREATE FUNCTION begin () ...

But that's been true all along with no field complaints, so we'll
leave that issue for another day.

In the name of keeping things readable, move the logic supporting
this out of the {identifier} flex rule and into some small new
subroutines.  Also rename existing related PsqlScanState fields
to help distinguish them from the added fields.

This patch also fixes what seems to me (tgl) a small bug: \;
would reset BEGIN/END detection even when inside parens or BEGIN.
That's unlike what a plain semicolon would do, and no such effect
is suggested by the documentation.

Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/8E03BB8D-003D-4850-9772-5F8015A5A0C7@gmail.com

doc: Describe better handling of indexes in ALTER TABLE ATTACH PARTITION

When ALTER TABLE ... ATTACH PARTITION matches partition indexes to the
parent table's indexes, invalid indexes are skipped. This commit
improves the documentation to describe what e90e9275f56 has changed:
invalid indexes are skipped, and only valid indexes are considered for a
match.

Author: Mohamed Ali <moali.pg@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAGnOmWpAMaE-BOkpwM6mJnHcpS2QZ8yLSSaqmz+vryEsbCWWWA@mail.gmail.com
Backpatch-through: 14

Readable identity strings for property graph objects

The "identity" column of pg_identify_object() for property graph
objects can be long string of names connected by "of", e.g. "a of l of
e of g".  The type of the first named object is given by column
"type".  But the types of intermediate objects are not easy to find
from the identity string especially when some of them share the same
name.  Some objects, like user mappings or authorization identifier
members, add types of objects other than the first one in the identity
string.  Do the same for property graph objects.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/aej1DkLwhyZWmtxJ%40bdtpg

doc: Update pg_dump/dumpall/upgrade about handling of external statistics

The pages of pg_dump, pg_dumpall and pg_upgrade mentioned that their
--no-statistics and --statistics options did not include the handling of
statistics created by CREATE STATISTICS, which was wrong.

Oversight in c32fb29e979d.

Reported-by: Igi Izumi <igi@sraoss.co.jp>
Discussion: https://postgr.es/m/19529-c7eb1e7a0b07eae6@postgresql.org

Fix unsafe order of operations in ResourceOwnerReleaseAll().

This function called the resource-kind-specific ReleaseResource()
method for each item before deleting that item from the resowner.
That's backwards from the ordering in ResourceOwnerReleaseAllOfKind,
and it's not very safe.  If ReleaseResource throws an error then the
subsequent abort cleanup will come back here and try to release that
item again, possibly leading to a double-free or similar crash,
and in any case risking an infinite error cleanup loop.  This mistake
explains why the pgcrypto bug just fixed in 80bb0ebcc led to a crash
rather than something more benign.

Remove the item from the resowner, then call ReleaseResource,
matching the way things were done before b8bff07da.  If there
is a problem of this sort, we'd prefer to leak the item than
suffer the other likely consequences.

Per further analysis of bug #19527.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/646741.1782157515@sss.pgh.pa.us
Backpatch-through: 17

pgcrypto: avoid recursive ResourceOwnerForget().

Raising an error within a function using an OSSLCipher object led
to a complaint from ResourceOwnerForget and then a double-free crash,
because ResOwnerReleaseOSSLCipher forgot to unhook the OSSLCipher
object from its owner. (The sibling logic for OSSLDigest objects got
this right, as did every other ReleaseResource function AFAICS.)

Oversight in cd694f60d.

Bug: #19527
Reported-by: Yuelin Wang <3020001251@tju.edu.cn>
Author: Yuelin Wang <3020001251@tju.edu.cn>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19527-6e7686960c6dce78@postgresql.org
Backpatch-through: 17

Strip removed-relation references from PlaceHolderVars at join removal

When left-join removal deletes a relation, remove_rel_from_query()
updates the relid sets attached to RestrictInfos and
EquivalenceMembers, and the canonical PlaceHolderVar held in each
PlaceHolderInfo, but it does not rewrite the PlaceHolderVars embedded
in clause and EquivalenceClass member expressions.  That has been
fine, because later processing consults those relid sets rather than
the embedded PlaceHolderVars.

However, such an expression may afterwards be translated for an
appendrel child and have its relids recomputed from scratch by
pull_varnos().  If the embedded PlaceHolderVar's phrels still mentions
the removed relation, pull_varnos() folds it back in, so the rebuilt
clause's relids reference a no-longer-existent relation.  That yields
a parameterized path keyed on the removed relation, tripping the
Assert on root->outer_join_rels in get_eclass_indexes_for_relids().

Fix by stripping the removed relids from the PlaceHolderVars in
surviving rels' baserestrictinfo and in EquivalenceClass member
expressions, keeping them consistent with the canonical
PlaceHolderVars.

This is only reachable on v18 and later, where
match_index_to_operand() began ignoring PlaceHolderVars; before that,
the wrapping PlaceHolderVar prevented the index match that exposes the
stale relids.

Reported-by: Alexander Kuzmenkov <akuzmenkov@tigerdata.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CALzhyqwryL2QywgO03VQr_237Sq3MEVgTTT2_A9G3nGT5-SRZg@mail.gmail.com
Backpatch-through: 18

plpython: Use funccache.c infrastructure for procedure caching.

PL/Python set-returning functions can crash with a use-after-free when
CREATE OR REPLACE FUNCTION is executed while the SRF is mid-iteration.
The crash occurs because srfstate->savedargs is allocated in proc->mcxt,
which gets deleted when the procedure is invalidated, leaving a dangling
pointer that PLy_function_restore_args() then dereferences.

The best fix is to use reference counting to prevent destroying the
function state while it's still in use, similar to what PL/pgSQL has
done.  Rather than inventing a new wheel, this commit converts
PL/Python to use the funccache.c infrastructure.

The main challenge is that PL/Python uses SFRM_ValuePerCall for SRFs,
where the handler is called multiple times.  A naive implementation
would allow the refcount to return to zero between calls, but we need
to hang onto the original state and function body.  SQL-language
functions face the same challenge, so this commit follows the same
approach used in functions.c: maintain a per-call-site cache struct
(PLyProcedureCache) in fn_extra that holds both the pointer to the
long-lived PLyProcedure and the SRF execution state.

The use_count is incremented when we first obtain the procedure and is
decremented via a MemoryContextCallback registered on fn_mcxt, which runs
even during error aborts. Cleaning up the per-call SRF state needs more
care: an ExprContextCallback handles the in-query cases, since the
iterator is not guaranteed to run to completion (for example a LIMIT or a
rescan can abandon it early). But unlike SQL functions, whose resources
are released by transaction abort, PL/Python holds Python reference counts
on the iterator and saved arguments that abort will not release, and
ExprContextCallbacks are not invoked during an error abort. The
MemoryContextCallback on fn_mcxt therefore doubles as the backstop that
releases those references when a query errors out mid-iteration.

Since fn_extra is now used for PLyProcedureCache, this commit removes
use of the funcapi.h SRF infrastructure (SRF_IS_FIRSTCALL,
SRF_RETURN_NEXT, etc.) and switches to direct isDone signaling via
ReturnSetInfo, matching how SQL functions handle ValuePerCall mode.

This fixes a longstanding bug, so ideally we'd back-patch it.  But
it'd be impractical to back-patch further than v18 where funccache.c
came in.  The patch is somewhat invasive, and the bug only arises in
very uncommon usages (which is why it evaded detection for so long).
On the whole, the risk/reward ratio for putting this into v18 doesn't
seem good, so commit to master only.

Bug: #19480
Reported-by: Andrzej Doros <adoros@starfishstorage.com>
Author: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19480-f1f9fdce30462fc4@postgresql.org

doc: Clarify pg_get_sequence_data() privileges and NULL results

The documentation for pg_get_sequence_data() did not match the
function's behavior. It stated that either USAGE or SELECT privilege
was sufficient, but the function returns sequence data only when the
caller has SELECT privilege.

The documentation also did not explain that the function returns a row
containing all NULL values when sequence data cannot be returned, such
as when the sequence does not exist or the caller lacks the required
privilege.

Update the documentation to reflect the actual behavior, including the
required privilege and the result returned when sequence data is
unavailable.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Tristan Partin <tristan@partin.io>
Discussion: https://postgr.es/m/CAHGQGwGNTaXnBKUV510_P1KwhdbHT+kgZ4zU5njBHy7nCqdhzg@mail.gmail.com

Fix misreporting of publisher sequence permissions during sync

When synchronizing sequences for logical replication, a
publisher-side permission failure could be reported as if the sequence
were missing on the publisher, making the real cause harder to
identify.

This happened because pg_get_sequence_data() returns a row of NULL
values when the replication connection lacks permission to read a
sequence. Sequence synchronization treated that the same as a missing
sequence, causing it to emit a misleading "missing sequence on
publisher" warning.

Fix this by distinguishing permission failures from genuinely missing
sequences. The synchronization query now checks whether the
replication connection has the required privilege for each published
sequence, allowing the worker to report permission failures
separately.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Tristan Partin <tristan@partin.io>
Discussion: https://postgr.es/m/CAHGQGwGNTaXnBKUV510_P1KwhdbHT+kgZ4zU5njBHy7nCqdhzg@mail.gmail.com

Make type cache initialization more resilient on re-entry after OOM

An out-of-memory failure while initializing the type cache hash tables
would issue an ERROR and leave a backend in a partially inconsistent
state.  Without assertions, the server would crash with a NULL pointer
dereference on initialization re-entry when doing a type lookup due to
one or both hash tables missing.  An assertion would trigger if these
are enabled in the build.

This commit changes the ordering of the type cache initialization to
become more robust on re-entry after an in-flight allocation failure:
- The two hash tables are initialized first, and can only be initialized
once.
- The initialization is considered as done once the in-progress list is
allocated in the CacheMemoryContext.  This is now the last allocation
step.
- Last, the callbacks are registered.  These can only fail with a FATAL
error, taking down the process so leaving the process in a non-complete
state is fine.

This is in the same spirit as b85f9c00fb88 and 29fb598b9cad, where
random allocation failures can make the backend go crazy in the code
paths fixed due to the static states becoming inconsistent.  Like the
other fixes, this is unlikely going to show up in practice, so no
backpatch is done.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/e77acaac-a1b3-40b3-99ee-5769b4e453e4@gmail.com

Make StandbyAcquireAccessExclusiveLock() more resilent with OOMs

In StandbyReleaseXidEntryLocks, a failure in acquiring a lock with
LockAcquire() due to an out-of-memory problem would lead to an
inconsistency with the lock state cached in the startup process,
impacting the list of RecoveryLockXidEntrys.  The code is updated here
so as the cached state is updated once the lock is acquired.

This problem is unlikely going to happen in practice.  Even if it were
to show up, it would translate to a LOG message for non-assert builds
(assertion failure otherwise), so no backpatch is done.  This commit is
in the same spirit as 29fb598b9cad, with a problem emulated by injecting
random failures for allocations.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/e77acaac-a1b3-40b3-99ee-5769b4e453e4@gmail.com

Make pg_mkdir_p() tolerant of a concurrent directory creation.

pg_mkdir_p creates each missing path component with a stat() followed
by mkdir().  If the stat() reports the component as absent but another
process creates it in the window before this process's mkdir(), mkdir()
fails with EEXIST and pg_mkdir_p treated that as a hard error -- unlike
"mkdir -p", which is meant to be idempotent and race-tolerant.

This shows up when several processes concurrently create paths that
share an ancestor directory: for example, parallel initdb runs whose
data directories live under a common temporary directory.  One process
wins the race to create the shared ancestor and the others fail with
    could not create directory "...": File exists

Fix this race condition by first trying mkdir() and only attempting
stat() if it fails with EEXIST.

On Windows, there's an additional problem: stat() opens a file handle
and participates in share-mode locking, which means it can transiently
fail on a directory another process is concurrently creating.  Use
GetFileAttributes() instead: it requests only FILE_READ_ATTRIBUTES
and is exempt from share-mode denial, so it reliably sees a
concurrently-created directory.

I (tgl) also chose to back-patch 039f7ee0f's effects on this function,
so that pgmkdirp.c remains identical in all live branches.

Author: Andrew Dunstan <andrew@dunslane.net>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3ca004de-e49b-4471-b8aa-fd656e70f68c@dunslane.net
Backpatch-through: 14

Update JIT tuple deforming code for virtual generated columns

The JIT deforming code contains an optimization that determines which
columns are guaranteed to exist in the tuple. That's used to allow
skipping of reading the tuple's natts when the code only needs to deform
attributes that are guaranteed to always exist in all tuples. 83ea6c540
missed updating this code to account for VIRTUAL generated columns.
These are stored as NULLs in the tuple, but may be defined as NOT NULL.
This could result in the code thinking more columns are guaranteed to
exist than actually do.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/1151393.1781734980@sss.pgh.pa.us

Fix comments on data checksum cost settings

The cost parameters for the data checksums worker can be updated by the
user issuing a repeated enable checksum command, but the comments on the
struct members hadn't been updated to reflect this and were out of date.
Another part of the same comment needed better wording to be readable.

Also wrap the reading of the parameters in a lock, there is no live
bug due to not using a lock but it's still the right thing to do.

Author: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reported-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/2176020b-ecbc-438b-9fc3-9c3593d9e6fc@iki.fi

Silence "may be used uninitialized" compiler warning.

Newer gcc warns that this "actual_arg_types" variable may be used
uninitialized, but visual inspection indicates there's no bug. To
silence the warning, initialize the variable to zeros.

Bug: #19485
Reported-by: Hans Buschmann <buschmann@nidsa.net>
Tested-by: Erik Rijkers <er@xs4all.nl>
Tested-by: Hans Buschmann <buschmann@nidsa.net>
Reviewed-by: Tristan Partin <tristan@partin.io>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/19485-2b03231a775756f1%40postgresql.org
Discussion: https://postgr.es/m/6c52a1a6612948519468d46cb224a8c4%40nidsa.net

hstore_plperl: Add CHECK_FOR_INTERRUPTS() in reference-unwinding loop.

Add CHECK_FOR_INTERRUPTS() to the while loop in plperl_to_hstore()
that dereferences chains of Perl references, so that a circular
reference (e.g. $x = \$x) can be cancelled by the user instead of
spinning indefinitely. (We looked at detecting such circular
references, but it seems more trouble than it's worth.)

This is a follow-up to da82fbb8f, which fixed the same issue in
SV_to_JsonbValue() in jsonb_plperl.

Author: Aleksander Alekseev <aleksander@tigerdata.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAJ7c6TPbjkzUk4qJ5dHvDNEz0hBuFue3A-XWz_=897z+BC+z8A@mail.gmail.com
Backpatch-through: 14

Avoid errors during DROP SUBSCRIPTION when slot_name is NONE.

Previously, if the subscription used a server,
ForeignServerConnectionString() could raise an error (e.g. missing
user mapping) during DROP SUBSCRIPTION even if the conninfo wasn't
needed at all.

Construct conninfo after the early return, so that if slot_name is
NONE and rstates is NIL, the DROP SUBSCRIPTION will succeed even if
ForeignServerConnectionString() raises an error (e.g. missing user
mapping).

If slot_name is NONE and rstates is not NIL, DROP SUBSCRIPTION may
still encounter an error from ForeignServerConnectionString().

Reported-by: Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/OS9PR01MB12149B54DEA148108C6FA5667F52D2@OS9PR01MB12149.jpnprd01.prod.outlook.com

doc PG 19 relnotes: fix "standard_conforming_strings = off text

Reported-by: Tom Lane
Discussion: https://postgr.es/m/1242905.1781796886@sss.pgh.pa.us

Avoid division-by-zero when calculating autovacuum MXID score.

In some cases, effective_multixact_freeze_max_age can be 0, which
presents a division-by-zero hazard for the multixact ID age score
calculation. To fix, bump it to 1 in that case so that we use the
multixact ID age as the score. While at it, also document that
this component score scales due to high multixact member space
usage.

Reported-by: Masahiko Sawada <sawada.mshk@gmail.com>
Author: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CAD21AoC6nKeYAjTvJ9dmBea03GZK9222h_O%3DONmcVuxfyO88Bg%40mail.gmail.com

doc: Fix "Prev" link, take 2.

Commit 6678b58d78 fixed a wrong "Prev" link by changing the link
generation code to use [position()=last()] instead of [last()] in
the predicate on the union of reverse axes. Unfortunately, that
caused documentation builds to take much longer. To fix, combine
the "preceding" and "ancestor" steps into one "preceding" step and
one "ancestor" step, and revert the predicate back to [last()].
The smaller union evades the libxml2 bug while avoiding the build
time regression.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1132496.1781718007%40sss.pgh.pa.us
Backpatch-through: 14

Revert non-text output formats for pg_dumpall

This reverts the non-text (custom/directory/tar) output format support
for pg_dumpall added by 763aaa06f03 and its feature-specific follow-ups,
in line with Noah Misch's post-commit review which recommends reverting
and finishing the work through the commitfest.

Scope is deliberately minimal: only the feature itself is removed.
Independent improvements that merely touched the same files, or that were
committed alongside the feature but do not depend on its design, are
preserved.

Reverted (the feature):

  763aaa06f03  Add non-text output formats to pg_dumpall
  d6d9b96b404  Clean up nodes that are no longer of use in 007_pgdumpall.pl
  01c729e0c7a  Fix casting away const-ness in pg_restore.c
  c7572cd48d3  Improve writing map.dat preamble
  3c19983cc08  pg_restore: add --no-globals option to skip globals
  abff4492d02  Fix options listing of pg_restore --no-globals
  bb53b8d359d  Fix small memory leak in get_dbname_oid_list_from_mfile()
  a793677e57b  pg_restore: Remove dead code in restore_all_databases()
  a198c26dede  pg_dumpall: simplify coding of dropDBs()
  ec80215c033  pg_restore: Remove unnecessary strlen() calls in options parsing

Preserved (independent of the feature):

  b2898baaf7e  the check_mut_excl_opts() helper in src/fe_utils/option_utils.c
               and its use in pg_dump
  7c8280eeb58  pg_dump's conflicting-option refactor (and tests 002/005)
  be0d0b457cb  pg_dumpall's rejection of --clean together with --data-only
               (re-expressed directly, since pg_dumpall.c is otherwise
               returned to its pre-feature state)
  74b4438a70b  the dangling-grantor-OID GRANT fix (back-patched through 16)
  273d26b75e7, d4cb9c37765  independent pg_restore.sgml clarifications

Because the feature restructured pg_dumpall.c and pg_restore.c (pg_restore's
main() was split into restore_one_database() plus a dispatcher) and
interleaved its option checks with the conflicting-option refactor in the
same regions, the cosmetic check_mut_excl_opts() reflow of those two files'
option blocks is inseparable from the feature and comes out with it; the
behavior is unchanged.  The reusable helper and pg_dump's use of it are
unaffected.

Discussion: https://postgr.es/m/20260607000218.96.noahmisch@microsoft.com

Create TOAST table for partitions made by MERGE/SPLIT PARTITION

ALTER TABLE ... MERGE PARTITIONS / SPLIT PARTITION builds a new
partition via createPartitionTable(), but never gives it a TOAST table.
When the source rows carried out-of-line varlena values, the move
into the new partition entered heap_toast_insert_or_update() with
reltoastrelid = InvalidOid: the externalization step is skipped, the
value falls back to inline storage and heap_insert() fails with
"row is too big" error. Also, TOAST table is needed if the new partition
receives out-of-line varlena values after the DDL operation is complete.

Call NewRelationCreateToastTable() right after the new partition is
created in createPartitionTable(), mirroring what DefineRelation()
does for regular CREATE TABLE. NewRelationCreateToastTable() decides
on its own whether a TOAST table is actually required, so partitions
with no toast-eligible columns are unaffected.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Discussion: https://postgr.es/m/ai_c4-v8iLA2kXFV%40pryzbyj2023
Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>

Make GetSnapshotData() more resilient on out-of-memory errors

If the allocation of Snapshot->subxip fails, a follow-up call of
GetSnapshotData() would see a partially-initialized snapshot, causing a
NULL dereference on reentry when using "subxip" because only "xip" would
be allocated. In the event of an out-of-memory error when allocating
"subxip", "xip" is now reset before throwing an ERROR, so as Snapshots
can be allocated and handled gracefully on retry.

This problem is unlikely going to show up in practice, so no backpatch.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Matthias van de Meent <boekewurm+postgres@gmail.com>
Discussion: https://postgr.es/m/e77acaac-a1b3-40b3-99ee-5769b4e453e4@gmail.com

Avoid stale slot access after dropping obsolete synced slots.

drop_local_obsolete_slots() continued to dereference local_slot after
calling ReplicationSlotDropAcquired(). Once the slot is dropped, its
entry in the slot array can be reused by another backend, so later reads
of local_slot->data could observe a different slot's name or database
OID, leading to an incorrect unlock and log message.

Save the slot name and database OID before performing the drop, and use
the saved values for the subsequent UnlockSharedObject() call and the log
message. While at it, emit the "dropped replication slot" message only
when a slot was actually dropped, rather than unconditionally.

Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/TY4PR01MB177184FF9EE916F577E1F554194082@TY4PR01MB17718.jpnprd01.prod.outlook.com

Fix PANIC with track_functions due to concurrent drop of pgstats entries

pgstat_drop_entry_internal() generates an ERROR if facing a pgstats
entry already marked as dropped.  With a workload doing a lot of
concurrent CALL and DROP/CREATE PROCEDURE, it could be possible for
AtEOXact_PgStat_DroppedStats(), that wants to do transactional drops, to
find entries that are already dropped, after a commit record has been
written.  In this case, ERRORs are upgraded to PANIC, taking down the
server.

This issue is fixed by making pgstat_drop_entry() optionally more
tolerant to concurrent drops, adding to the routine a missing_ok option
to make some of its callers more tolerant (spoiler: some of the callers
want a strict behavior, like replication slots and backend stats).
pgstat_drop_entry_internal() cannot be called anymore for an entry
marked as dropped, hence its error is replaced by an assertion.
Functions are handled as a special case in core; this problem could also
apply to custom stats kinds depending on what an extension does.
track_functions is costly when enabled (disabled by default), which is
perhaps the main reason why this has not be found yet.

A similar version of this patch has been proposed by Sami Imseih on a
different thread for a feature in development.  This version has tweaked
here by me for the sake of fixing this issue.

Reported-by: zhanglihui <zlh21343@163.com>
Author: Sami Imseih <samimseih@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/19520-73873648d44793cf@postgresql.org
Backpatch-through: 15

Doc: Clarify that publication exclusions track table identity.

The EXCEPT clause of a FOR ALL TABLES publication tracks each excluded
table by its identity rather than by name. As a result, renaming a table
or moving it to another schema with ALTER TABLE ... SET SCHEMA leaves the
exclusion in place, and the table stays excluded from the publication.

This behavior was not previously documented and could surprise users who
might reasonably expect a schema-qualified exclusion to apply only while
the table remains in that schema. Add a note to CREATE PUBLICATION to make
the behavior explicit.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAHut+PvQ5BqnawCQd6r1tqqd+iAJC-CuRY8wscuXSrpHGUzofA@mail.gmail.com

doc PG 19 relnotes: update to current

doc PG 19 relnotes: remove VALIDATE CONSTRAINT lock item

Reported-by: Fujii Masao
Discussion: https://postgr.es/m/CAHGQGwGjfu5n5H-jsMp6fAsf-zMJzYWDcJGti6pmr7SHf9BcpA@mail.gmail.com

Fix ALTER DOMAIN VALIDATE CONSTRAINT locking

Commit 16a0039dc0d1 reduced the lock level for ALTER DOMAIN ...
VALIDATE CONSTRAINT from ShareLock to ShareUpdateExclusiveLock.
However, that change was unsafe. If DML on tables using the domain had
already started and initialized domain constraint checks before a NOT
VALID constraint was added, it could still insert or update rows that
violated the new constraint.

This commit reverts commit 16a0039dc0d1 so that ALTER DOMAIN ...
VALIDATE CONSTRAINT once again acquires ShareLock on relations using
the domain. Also add an isolation test covering this case.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/463C0E1A-4A40-4BCA-839C-9236B80D65EE@gmail.com

doc PG 19 relnotes: mention pg_dump{all} & standard_conforming_strings

Dumps will fail if standard_conforming_strings = off.

Reported-by: Tom Lane
Backpatch-through: 1131492.1781717349@sss.pgh.pa.us

Avoid errors during ALTER SUBSCRIPTION.

Previously, when retrieving the old Subscription object, constructing
the conninfo could encounter an error during
ForeignServerConnectionString(). ACL errors were handled properly, but
other errors could interfere with a user fixing the problem with ALTER
SUBSCRIPTION.

Reported-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/D908370F-2695-4231-851D-17179A6A6F2A@gmail.com

oauth: Skip call-count test for libcurl 8.20.0

The call-count test in 001_server.pl runs into a recent upstream
regression in Curl:

https://github.com/curl/curl/issues/21547

The symptom is high CPU usage on some platforms during OAuth HTTP
requests. But it looks like the fix is on track for a June 2026 release,
as part of Curl 8.21.0, so just skip the test if we happen to be using
the broken version.

Reported-by: Andrew Dunstan <andrew@dunslane.net>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOYmi%2B%3DyrwMSsHuNJ1V14isA4iSix5Xb3P3VEp1X0BS61MdV4A%40mail.gmail.com
Backpatch-through: 18

libpq-oauth: Print libcurl version with OAUTHDEBUG_UNSAFE_TRACE

When debugging an OAuth trace, it's helpful to know what version of Curl
is in use. The SSL library that Curl is using (which may not be the one
in use by libpq) is also relevant, and it's just as easy to get, so
print that too.

This is being added post-feature-freeze, with RMT approval, in order to
fix some tests in the face of an upstream Curl regression. A subsequent
commit will make use of it in oauth_validator. Backpatch to 18 as well.

Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOYmi%2B%3DkP86t%2BZFFXNQ9G6K4ht7utdmB%3DCzhP%3DZ2wvuBymOTtQ%40mail.gmail.com
Backpatch-through: 18

oauth_validator: Print captured stderr after call-count failure

If the call count test fails, you'll reasonably want to know what the
network trace looked like, but that information is currently swallowed.
Print it out instead.

Backpatch-through: 18

jsonb_plperl, jsonb_plpython: Fix unguarded recursion and loops.

Add check_stack_depth() to Jsonb_to_SV, SV_to_JsonbValue,
PLyObject_FromJsonbContainer, and PLyObject_ToJsonbValue. Without
this, deeply nested JSONB values can crash the backend with SIGSEGV
instead of raising a proper error.

Also add CHECK_FOR_INTERRUPTS() to the while loop in SV_to_JsonbValue
that dereferences chains of Perl references, so that a circular
reference (e.g. $x = \$x) can be cancelled by the user instead of
spinning indefinitely. (We looked at detecting such circular
references, but it seems more trouble than it's worth.)

Author: Aleksander Alekseev <aleksander@tigerdata.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAJ7c6TPbjkzUk4qJ5dHvDNEz0hBuFue3A-XWz_=897z+BC+z8A@mail.gmail.com
Backpatch-through: 14

vacuumdb: Fix --missing-stats-only for partitioned indexes.

The current form of the catalog query picks up partitioned tables
with expression indexes that lack statistics. However, since such
indexes never have statistics, there's no point in analyzing them.
To fix, adjust the relevant part of the query to skip partitioned
tables with expression indexes. While at it, remove the nearby
stainherit check; entries for index expressions always have
stainherit = false.

Author: Baji Shaik <baji.pgdev@gmail.com>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/CA%2Bfm-RPE1tEc6CUUPDyRbYTz9tF5Kw47nnk-Zq%3DyYvanbsxyCQ%40mail.gmail.com
Backpatch-through: 18

Fix pgstat_count_io_op_time() calls passing incorrect information

Several calls of pgstat_count_io_op_time() have been used as data to
count negative values returned by pg_pread() or pg_pwrite(), leading to
an incorrect count reported, casting them back to uint64.

Most of the problematic calls updated here are adjusted so as we do not
report buggy negative numbers anymore. In xlogrecovery.c, the spot
updated still counts short reads. In xlog.c, after a WAL segment
initialization, I/O numbers are aggregated only after checking that the
operation has succeeded.

issues introduced by a051e71e28a1.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/0db864e6-4477-4eba-b2be-d3523cc86564@eisentraut.org
Backpatch-through: 18

Silence uninitialized variable warning with some compiler versions

The first "if (difffile)" block initializes the startpos variable, and
the second "if (difffile)" block reads it. The second if-condition can
only be true when the first one was true, so the startpos variable is
always initialized when it's used. However, the compiler might not be
able to deduce that, and warn about startpos being used uninitialized.

To silence the warning, rearrange the if-checks. Also, bail out if the
diff file cannot be opened, instead of ignoring it silently.

Author: Mikhail Litsarev <m.litsarev@postgrespro.ru>
Reviewed-by; Ewan Young <kdbase.hack@gmail.com>
Discussion: https://www.postgresql.org/message-id/ee06f058c626cd37babd8c81579ffb1e@postgrespro.ru

Add tuple deformation test for virtual generated columns

Add coverage for a virtual generated NOT NULL column followed by a
physically stored NOT NULL column. This exercises the tuple deformation
case fixed by 89eafad297a, where TupleDescFinalize() could incorrectly
treat a virtual generated column as part of the guaranteed physical column
prefix and compute cached offsets past it.

Without that fix, deforming the following column could read from the wrong
tuple offset.

Author: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/A4BC563C-0CA3-4EF3-952A-EA41F9E5BF1E%40gmail.com

Fix error message typo.

4e5920e6de8 added an ereport call with the primary error message
starting with upper case, which is prohibited by our error message
style guide. This commit fixes it.

Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Tatsuo Ishii <ishii@postgresql.org>
Discussion: https://postgr.es/m/8BACA715-B9B6-479D-9153-C05F05482664%40gmail.com

Fix RI fast-path for domain-typed FK columns

The RI fast path is the first caller to pass a cross-type pf_eq_oprs
operator to ri_HashCompareOp().  Its test for whether a cast can be
skipped, "typeid == righttype", failed when the FK column was a domain,
since typeid is then the domain OID rather than its base type.  The code
concluded no usable conversion existed and threw "no conversion function
from <domain> to <type>" for every valid row.

Look through the domain to its base type.  When pfeqop comes directly
from the index opfamily its right-hand input is getBaseType(fktype), so
getBaseType(typeid) == righttype is the correct test; the PK = PK
fallback (right-hand input opcintype) still fails that test and falls
through to the existing cast lookup unchanged.

Oversight in commit 2da86c1.

Reported-by: Ewan Young <kdbase.hack@gmail.com>
Author: Ewan Young <kdbase.hack@gmail.com>
Reviewed-by: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CAON2xHNDFC4cX2atvTpMuC=cK9y7q4J+n3+15w4148AohXEc1w@mail.gmail.com

Fix to not allow null treatment to non window functions.

The null treatment clause (RESPECT NULLS/IGNORE NULLS) are only
allowed to window functions per spec. Previously the check was only
applied to aggregates in window clause. Other types of functions were
allowed to use the clause, which was plain wrong.

To fix this, ParseFuncOrColumn() now checks whether other than window
functions are used with the null treatment clause. If so, error out.

Also remove the unnecessary test for "aggregate functions do not
accept RESPECT/IGNORE NULLS" because it is now checked in the
early-stage new check. The window regression test expected file is
changed accordingly.

Reported-by: jian he <jian.universality@gmail.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Author: Tatsuo Ishii <ishii@postgresql.org>
Discussion: https://postgr.es/m/CACJufxFnm%2BAj2Jyhyd58PtW8e1vTZDKimkZE%2BMashCPSDKw56Q%40mail.gmail.com

Fix another instability in recovery TAP test 004_timeline_switch

The test did not wait for the standby to be connected to the primary.
This breaks one assumption at the beginning of the test, where the
primary is stopped to ensure that all its records are flushed to both
standbys before moving on with its next steps.

If standby_1 finishes ahead of standby_2, the test would be able work
fine as the former waits for the latter. The opposite is not true,
standby_2 getting ahead of standby_1 would cause the test to fail on
timeout when standby_1 attempts to connect to standby_2.

This commit adds an additional polling query after the two standbys are
started, checking that both standbys are connected to the primary before
processing with the initial steps of the test.

Like 7185eddf0522, backpatch down to v14.

Author: Sergey Tatarintsev <s.tatarintsev@postgrespro.ru>
Reviewed-by: Ewan Young <kdbase.hack@gmail.com>
Discussion: https://postgr.es/m/fea4190e-f8b5-4432-a52d-bcbee5f34366@postgrespro.ru
Backpatch-through: 14

doc PG 19 relnotes: update to current

Reported-by: Chao Li

logical decoding: Correctly free speculative insertion

The error path in ReorderBufferProcessTXN was not freeing
(reorderbuffer.c's representation of) a speculative insertion record
correctly.  In assert-enabled builds, this leads to an assertion
failure.  In production builds, I see no effect; there may be a small
transient leak, but in an improbable code path such as this, such a leak
is not of any significance.  For users running with assertions enabled,
the crash is annoying.

Fix by having ReorderBufferProcessTXN() free the speculative insert
ahead of freeing the rest of the transaction, and no longer try to
handle that insert as a separate argument to ReorderBufferResetTXN().

This code came in with commit 7259736a6e5b (14-era).  Backpatch all the
way back.

In branches 14-16, also backpatch the assertion that originally fails in
the problem scenario, which was added by dbed2e36625d (originally
backpatched to 17), that at the end of ReorderBufferReturnTXN() the
in-memory size of the transaction is zero.

Author: Vishal Prasanna <vishal.g@zohocorp.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/19c7623e882.4080fd5426212.311756747309556767@zohocorp.com

concurrent repack: check there are no leftover toast attribs

Upon reading attributes from the file of concurrent changes, verify that
none are left over unprocessed after we read all columns for the tuple.
This should never happen, so add an elog(ERROR) for it.

While at it, downgrade a nearby message from ereport() to elog(). These
things should never happen.

Author: Aleksander Alekseev <aleksander@tigerdata.com>
Discussion: https://postgr.es/m/CAJ7c6TMSF7cANU8nEJ9E28EvU74tE4H7AzT292Rt3ZuHqqxq8w@mail.gmail.com

pg_restore: Use dependency-based matching for STATISTICS DATA

The previous approach introduced by 0dd93de69e80 was weak in terms of
name matching, as an --index=foo could match with a table with the same
name but from a different schema, pulling in more data than necessary.

For example, imagine the following case:
CREATE SCHEMA s1;
CREATE SCHEMA s2;
CREATE TABLE s1.foo (id int);
INSERT INTO s1.foo SELECT generate_series(1,100);
ANALYZE s1.foo;
CREATE TABLE s2.bar (id int);
CREATE INDEX foo ON s2.bar(id);
INSERT INTO s2.bar SELECT generate_series(1,100);
ANALYZE s2.bar;

A targetted pg_restore --index=foo would grab the relation and attribute
stats of s1.foo on top of the index s2.foo, which is incorrect. This
commit fixes this scenario by relying on a lookup of the dependencies of
a STATISTICS DATA TOC entry, checking if a TOC entry depends on an index
or another relkind before matching with the names of the objects wanted
for the restore.

Discussion: https://postgr.es/m/ajDBwpxs-otl585H@paquier.xyz
Backpatch-through: 18

Fix int32 overflow in ltree_compare()

The expression (len_diff * 10 * (an + 1)) used as the return value of
ltree_compare() is computed at int32 width.  With LTREE_MAX_LEVELS =
65535, the product can exceed INT32_MAX once an ltree has more than
~14,653 levels, which causes the result to wrap and invert its sign.
That corrupts btree ordering as well as the "magnitude" consumed by
ltree_penalty() for GiST page splits.

To fix, split ltree_compare() into two functions.  The new
ltree_compare_distance() function returns a float, which won't
overflow.  It's used by the ltree_penalty() caller.  All the other
callers only care about the sign of the return value, i.e. which of
the arguments is greater, so change ltree_compare() to not multiply
the result with (10 * (an + 1)), which avoids the overflow for those
callers.

Existing btree or GiST indexes on ltree columns containing values with
more than ~14,653 levels may be corrupt and should be REINDEXed.

Add a regression test based on the reporter's PoC.

Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reported-by: 王跃林 <violin0613@tju.edu.cn>
Discussion: https://www.postgresql.org/message-id/AI6AnABgKW93Qbx1jVzi84r9.8.1781322625756.Hmail.3020001251%40tju.edu.cn
Backpatch-through: 14

Reject oversized MCV lists in pg_restore_extended_stats()

import_mcv(), called by pg_restore_extended_stats(), allowed a list of
MCV items to be larger than the maximum supported when the stats are
loaded back in statext_mcv_deserialize() (STATS_MCVLIST_MAX_ITEMS or 10k
items). A follow-up attempt at loading them would cause a failure,
statext_mcv_deserialize() blocking any attempts.

Attempts at restoring MCV lists too long are now rejected, generating a
WARNING like other inconsistent inputs.

Author: Ewan Young <kdbase.hack@gmail.com>
Discussion: https://postgr.es/m/CAON2xHORd2ESXm1KcVeeZ0Kd_aJk4dL4M2WLtzVDM4puaZ-20w@mail.gmail.com

Fix various query jumble comments

Some comments for struct WindowFunc were trying to detail which fields
were irrelevant for query jumble but the list had not been kept
up-to-date. Here we fix that by removing the comment to allow the
"query_jumble_ignore" attribute to self-document. This involved
removing similar comments from other structs.

While we're on the topic, improve comments around why Consts only jumble
the "consttype" and also add some rationale about why various other fields
are ignored.

Reported-by: jian he <jian.universality@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxEWeP2SLVMsbFNynd0pQnwbxh6U-v1nq5ccf9mSvBZntw%40mail.gmail.com

pg_dump: Remove dead code in TAP tests

The schema_only_with_statistics test scenario was referenced in
002_pg_dump.pl, but was associated to no command sequence since
0ed92cf50cc4.

Issue discovered while investigating a different bug. Perhaps this
cleanup is not worth backpatching, but there is also an argument in
favor of reducing noise when touching this area of the code in stable
branches.

Reviewed-by: Ewan Young <kdbase.hack@gmail.com>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/ai-y0S7Z25NlrG_n@paquier.xyz
Backpatch-through: 18

Fix inconsistencies with pg_restore --statistics[-only]

Attempting to restore a schema, a table or an index with
--only-statistics skipped all the statistics of the objects wanted.
Like for pg_dump, statistics should be included, so this created an
assymetry between dump and restore.

A second set of problems existed for --table and --index, where the
presence of --statistics skipped the restore of the stats of the
object(s) targetted.

This issue has been reported originally as related to an inconsistency
with the way extended stats restore is handled in Postgres v19, but the
issue is related to the restore of relation and attribute statistics in
v18. Some TAP tests are added to cover all these cases.

Reported-by: Chao Li <li.evan.chao@gmail.com>
Author: Chao Li <li.evan.chao@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Discussion: https://postgr.es/m/66E80CAB-527C-42B1-BB65-3F82CF4AD998@gmail.com
Backpatch-through: 18

Clean up quoting of variable strings within replication commands.

Our handling of quoting within replication commands was pretty
sloppy, typically looking like
        appendStringInfo(&cmd, " SLOT \"%s\"", options->slotname);
This is fine as long as options->slotname doesn't contain a double
quote mark, but what if it does?  In principle this'd allow injection
of harmful options into replication commands, in the probably-unlikely
case that a slot name comes from untrustworthy input.  We ought to
clean that up.

Moreover, even the places that were trying to be more careful
generally got it wrong, because they used quoting subroutines
intended for SQL commands rather than something that will work
with the replication-command scanner repl_scanner.l.  For example,
several places naively use PQescapeLiteral() to quote option values
for replication commands.  If the string contains a backslash,
PQescapeLiteral() will produce E'...' literal syntax, which
repl_scanner.l doesn't recognize.  Another near miss was to use
quote_identifier() to quote identifiers.  That function won't quote
valid lowercase identifiers unless they match SQL keywords ... but in
this context, replication keywords are what matter.  Neither of these
errors seem to risk string injection, but they definitely can cause
syntax errors in replication commands that ought to be valid.

We can clean all this up by using simple quoting logic that just
doubles single or double quotes respectively.

Or at least, we could if repl_scanner.l handled doubled double quotes
in identifiers, but for some reason it doesn't!  So the first step in
this fix has to be to fix that.  (The fact that we'll later reject
slot names containing double quotes is very far short of justifying
this omission.)

Having done that, this patch runs around and applies correct
quoting in all places that generate replication commands containing
strings coming from outside the immediate context.  Probably some
of these places are safe because of restrictions elsewhere, but it
seems best to just quote all the time.

This was originally reported as a security bug, which it could be
if replication slot names or parameters were to originate from
untrustworthy sources.  But the security team concluded that that
was a very improbable situation, so we're just going to fix this
as a regular bug.

Reported-by: Team Dhiutsa
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/1648659.1781287310@sss.pgh.pa.us
Backpatch-through: 14

doc: Fix "Prev" link.

Presently, the "Prev" link on the page for background workers sends
you to the middle of the previous chapter instead of the actual
previous page. This appears to be caused by a libxml2 bug, but
regardless, a minimal fix is to change the link generation code to
use [position()=last()] instead of [last()] in the predicate on the
union of reverse axes.

Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/aim4AZorFKaC7Wrf%40nathan
Backpatch-through: 14

Doc: reword discussion of asterisk after table names in FROM.

The syntax "tablename *" has been obsolete for years, but we want to
retain it and its documentation for backward compatibility reasons.
However, the documentation wording was confusing and could be
understood to mean that "tablename *" is the same as "ONLY tablename".

Reported-by: Jochen Bandhauer <jochen.bandhauer@gmx.net>
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/178125831604.1285960.8250607197280951685@wrigleys.postgresql.org

Modernize pg_bsd_indent's error/warning reporting code.

Late-model clang complains that these functions should be labeled
with "format(printf, 2, 3)", and it's right.  But let's go a bit
further and also make use of varargs, to remove duplication and
allow these functions to be used with non-integer input values.

Since no good deed goes unpunished, I had to also adjust a couple
of call sites.  They weren't wrong as-is, since the size_t-sized
arguments were coerced to int on the way into diag3().  But
without that, we have to adjust the format strings.

The point of this is to suppress compiler warnings, so back-patch
into branches containing pg_bsd_indent, even though there's no
functional change.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Discussion: https://postgr.es/m/1645041.1781283554@sss.pgh.pa.us
Backpatch-through: 16

Fix PQdescribePrepared with more than 7498 params

If a query has more than 7498 params, the ParameterDescription message
exceeds the 30000 byte limit on messages that are not specifically
marked as possibly being longer than that (VALID_LONG_MESSAGE_TYPE).
To fix, add ParameterDescription to the list.

Author: Ning Sun <classicning@gmail.com>
Discussion: https://www.postgresql.org/message-id/dbfb4b65-0aa8-470a-8b87-b6496160b28a@gmail.com
Backpatch-through: 14

Trim regression test expected output for xml

This commit reduces the number of expected output files for the "xml"
test from three to two (well, mostly one, see below for details).

xml_2.out existed to handle some differences in output due to libxml2
2.9.3, due to some error context missing (085423e3e326).  This file is
removed, by tweaking the XML inputs to trigger the same error patterns
for the problematic 2.9.3 and other libxml2 versions.  This part is
authored by Tom Lane.

xml_1.out (no libxml2 support) is reduced in size by adding an \if query
that exits the test early.  This still checks NO_XML_SUPPORT() through
xmlin().  The rest of the test is skipped if XML input cannot be
handled by the backend.  This part has been written by me.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/aiu6CXO67q-s70n5@paquier.xyz
Backpatch-through: 14

Doc: remove stale entry for removed aclitem[] ~ aclitem operator.

Commit 2f70fdb06 removed the deprecated containment operator
~(aclitem[],aclitem) from the catalogs, but missed removing its entry
from the documentation. (Arguably the blame should fall on c62dd80cd,
which added this entry in contravention of the longstanding policy
that we don't document deprecated aliases in the first place.)

Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOzEurQSyR5psWukyhUz1LtxyO55C2Vfp0Fmt8w2jGKxhszQmQ@mail.gmail.com
Backpatch-through: 14

Fix oversight in commit aa1f93a33.

Since the remote column names of a foreign table could be longer than
NAMEDATALEN, remattrmap_cmp(), which compares such column names, should
have used strcmp(), not strncmp() with n=NAMEDATALEN.

Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/81D981EB-ECC1-495D-8EAC-5CFB67B2CF77%40gmail.com

amcheck: Use correct varlena size accessor in bt_normalize_tuple()

bt_normalize_tuple() uses VARSIZE() to get the size of varlena, even though
it's not yet known, that it has a 4-byte header. Fix this by replacing a
accessor with a universal VARSIZE_ANY().

Backpatch to all supported versions.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/7ckc7oka4bvafkf5bwlqs6ygrhlsbhz25ppozfch7zbuxcx3rf%40e4pr4oqenalc
Author: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Backpatch-through: 14

Adjust cross-version upgrade tests for seg_out() fix

Commit 0e1f1ed157e taught seg_out() to print the certainty indicator
on an interval's upper boundary, but it was back-patched only as far
as v14. When upgrading from an older release, the old server prints
the one test_seg row exercising that case ('4.6 .. ~7.0') without the
indicator, so the pre- and post-upgrade dumps do not match. Make
AdjustUpgrade.pm delete just that row; seg's comparison function does
distinguish the certainty indicators, so the otherwise identical row
'4.6 .. 7.0' is unaffected.

Back-patch to all supported branches.

Per buildfarm members crake and fairywren.

Discussion: https://postgr.es/m/5ccbdbde-6467-4a10-bf4d-0be73a05ce8d@dunslane.net

Fix translatable string construction in psql

Similar to commit 3692a622d3fd, for a slightly different code pattern in
psql.

No backpatch to avoid disrupting translation in stable branches.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/airjxKXx7aTG8kfE@alvherre.pgsql

Update expected regression test output for xml_2.out

This one has been forgotten in 8bf257aebac1. Per report from buildfarm
member massasauga.

Backpatch-through: 14

Fix second race with timeline selection during promotion

read_local_xlog_page_guts has the same race as logical_read_xlog_page:
RecoveryInProgress() can return true during promotion, impacting the
availability of the operations doing WAL page reads with this callback.

This problem is similar to eb4e7224a1c6 that has addressed the issue for
logical replication, impacting more areas of the code where this WAL
page callback can be used (same narrow window during promotion, same
availability issue):
- pg_walinspect.
- Slot advance (SQL function).
- Slot creation.

Repack workers (v19~) and 2PC files (since forever) can also use this
callback, but they are irrelevant as far as I know.  A test is added
with the SQL lookup functions.  This part relies on injection points,
and is backpatched down to v18, like the test added for eb4e7224a1c6.

This issue could probably be fixed as well in v14 and v15 for
pg_walinspect.  However, I also feel that there is a conservative
argument about consistency here due to the support of logical decoding
on standbys, so let's limit ourselves to v16 for now.  pg_walinspect is
used less in the field compared to the two other operations, making
addressing this problem less attractive in these two older branches.

Reported-by: Xuneng Zhou <xunengzhou@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Discussion: https://postgr.es/m/7daef094-abf3-4672-bc23-3df4763b16a3%40gmail.com
Backpatch-through: 16

Confine RI fast-path batching to the top transaction level

The FK fast-path batching added in b7b27eb41a5 buffers rows in a
transaction-lived cache (ri_fastpath_cache) keyed by constraint OID.
Running user-defined cast and equality functions during a batch flush,
together with the cache's lifetime and iteration, exposed two defects
reachable by an unprivileged table owner.

First, on subtransaction abort ri_FastPathSubXactCallback discarded the
entire cache.  An entry's batch holds rows buffered by the enclosing
transaction, not just the aborting subxact -- the cache is keyed by
constraint, so a single entry can mix rows from multiple subxact levels.
An internal subxact abort during after-trigger firing (e.g. a PL/pgSQL
BEGIN ... EXCEPTION block) therefore dropped buffered rows of the outer
transaction without running their FK checks, letting orphan rows commit
behind a constraint that still reported itself valid.  The discard also
left relations opened by the batch unclosed, producing "resource was not
closed" warnings.

Second, ri_FastPathEndBatch flushes by iterating the cache with
hash_seq_search.  If flush-time user code inserts into a different
fast-path FK table, a new entry is added to the cache mid-scan; it may
land in a bucket the scan has already passed and never be reached, and
ri_FastPathTeardown then destroys the cache without flushing it,
silently dropping that check.

Cleanly unwinding the cache on subxact abort would require tracking the
originating subxact of each buffered row, since rows from different
levels share an entry (the cache is keyed by constraint) and deferred
constraints cannot be flushed early at a subxact boundary.  Rather than
add that bookkeeping, confine batching to the top transaction level: in
RI_FKey_check, when GetCurrentTransactionNestLevel() > 1, use the
per-row fast path (ri_FastPathCheck) instead of buffering.  Rows checked
inside a subtransaction are then verified immediately and roll back
cleanly with their subtransaction, and the cache only ever holds
top-level rows.  With the cache confined to the top level, a
subtransaction abort has nothing of its own to discard, so
ri_FastPathSubXactCallback is removed along with its registration.

For the second defect, add a cache-wide flag (ri_fastpath_flushing) set
while ri_FastPathEndBatch iterates the cache.  A re-entrant FK check
arriving while the flag is set takes the per-row path rather than adding
an entry to the cache being scanned, so no entry can be missed and torn
down unflushed.  The flag is cleared in a PG_FINALLY so a flush that
throws (a reported violation or an error from user code) does not leave
it stuck.  As defensive insurance it is also cleared in
ri_FastPathXactCallback() at transaction end.

The per-row fast path still bypasses SPI and stays well ahead of the
pre-19 SPI-based check.  A fuller fix that preserves batching across
subtransactions -- whether by tracking the originating subxact of each
buffered row or by per-subxact cache stacks merged into the parent on
commit -- is left for a future release.

The subtransaction-abort case is covered by a new regression test.  The
mid-scan cross-table case depends on hash bucket placement and so is not
reliably reproducible in a portable test, but the flag prevents it by
construction.

Reported-by: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAM527d9exRCdWrhJOnAxk_vACg7sr_yPoaJp_+uCFY0qP8v=aw@mail.gmail.com

doc: fix reference for finding replication slots to drop

Commit a70bce43fb added instructions on how to recover if PostgreSQL
refuses to issue new transaction IDs because of imminent wraparound,
but when describing how to find replication slots that should be dropped,
it referred to pg_stat_replication where it should have referenced
pg_replication_slots.

In passing, decorate references to views with <structname> tags.

Backpatch to all supported versions.

Reported-By: Sanjaya Waruna <sanjaya.waruna@gmail.com>
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/176767268098.1084085.10345048667224193115@wrigleys.postgresql.org
Backpatch-through: 14

Fix out-of-bounds write in RI fast-path batch on re-entry

The FK fast-path batching added in b7b27eb41a5 wrote the incoming row
into the batch array before checking whether the array was full:

    fpentry->batch[fpentry->batch_count] = ExecCopySlotHeapTuple(newslot);
    fpentry->batch_count++;
    if (fpentry->batch_count >= RI_FASTPATH_BATCH_SIZE)
        ri_FastPathBatchFlush(fpentry, fk_rel, riinfo);

batch_count is reset to zero only at the end of ri_FastPathBatchFlush(),
so it remains at RI_FASTPATH_BATCH_SIZE throughout a full-batch flush.
A flush runs user-defined cast functions and equality operators; if that
user code performs DML on the same FK table, ri_FastPathBatchAdd()
re-enters with batch_count == RI_FASTPATH_BATCH_SIZE and writes one past
the end of the array, corrupting the adjacent batch_count field.  This
is reachable by an unprivileged table owner via an implicit cast with a
PL/pgSQL function and causes a SIGSEGV in assert-enabled builds.

Fix by bounds-checking the write into the batch array so a re-entrant
add can never write past the end, and by adding a "flushing" flag to
RI_FastPathEntry that routes re-entrant ri_FastPathBatchAdd() calls on
a busy entry to the per-row path (ri_FastPathCheck) instead of touching
the mid-flush batch array.  The flag is set around the probe in
ri_FastPathBatchFlush() and cleared in a PG_FINALLY, which also resets
batch_count, so the entry is left empty and reusable if a flush error
(including a reported FK violation) is caught by a savepoint.

Add regression tests for both the re-entrant flush and reuse of an entry
after a flush error caught by a savepoint.

Reported-by: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Nikolay Samokhvalov <nik@postgres.ai>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: https://postgr.es/m/CAM527d9exRCdWrhJOnAxk_vACg7sr_yPoaJp_+uCFY0qP8v=aw@mail.gmail.com

Fix handling of namespace nodes in xpath() (xml)

xpath() attempted to call xmlCopyNode() and xmlNodeDump() on a
XML_NAMESPACE_DECL, finishing with a confusing error:
=# SELECT xpath('//namespace::foo', '<root xmlns:foo="http://127.0.0.1"/>');
ERROR:  53200: could not copy node
CONTEXT:  SQL function "xpath" statement 1

xpath() is changed so as it goes through xmlXPathCastNodeToString()
instead, that is able to handle namespace nodes.  xml2 uses the same
solution.  This issue has been discovered while digging into
9d33a5a804db.

Author: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aioT7ui_ZJ9RMlfM@paquier.xyz
Backpatch-through: 14

amcheck: Fix missing allequalimage corruption report

When amcheck validates that a B-Tree metapage's allequalimage flag
matches _bt_allequalimage(), it could fail to report corruption
unless one of the index key columns used interval_ops. As a result,
pg_amcheck could silently miss this corruption on other opclasses,
incorrectly reporting the index as valid.

The mistake was that bt_index_check_callback() kept ereport(ERROR)
inside the loop that scans key attributes for INTERVAL_BTREE_FAM_OID,
even though that loop is only needed to decide whether to add
the interval-specific hint. This commit moves ereport() out of the loop
so allequalimage mismatches are always reported, while still emitting
the hint for affected interval indexes.

Back-patch to v18, where d70b17636dd introduced this regression
while moving the check into bt_index_check_callback().

Author: Chao Li <lic@highgo.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/011ACC9C-CB87-4160-ACE7-4ED57AB86E15@gmail.com
Backpatch-through: 18

Fix md5_password_warnings for role and database settings

MD5 authentication warnings are queued during authentication, before
startup options and role/database settings have been applied. The code
checked md5_password_warnings at queue time, so settings such as
ALTER ROLE ... SET md5_password_warnings = off did not suppress the
warning, even though the established session showed the GUC as off.

Keep the connection-warning infrastructure generic by allowing each
queued warning to carry an optional filter callback. Evaluate that
callback when warnings are emitted, after startup options and
role/database settings have been processed.

Use this for MD5 authentication warnings, while leaving password
expiration warnings unchanged. Add test coverage for an MD5-authenticated
role with md5_password_warnings disabled.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/AE46E42D-5966-4D76-9E64-95EAB01B9FB5@gmail.com

Fix type confusion in AddRelsyncInvalidationMessage

Since this is trying to add a SharedInvalRelSyncMsg rather than
a SharedInvalRelcacheMsg, it should use rs rather than rc.

This makes no difference as things stand, because the two structure
definitions are identical (except for the capitalization of "relid"),
but it's still a good idea to fix it.

Co-authored-by: Stolpovskikh Danil <d.stolpovskikh@ftdata.ru>
Co-authored-by: Robert Haas <rhaas@postgresql.org>
Discussion: http://postgr.es/m/bd6a5735b72b4afe99af49c3c62901d6@localhost.localdomain

Fix translatable string construction

In a few places, we were constructing translatable strings consisting of
elements list by adding one element at a time and separately a comma.
This is not great from a translation point of view, so rewrite to append
the comma together with the corresponding element in one go.

Author: Peter Smith <smithpb2250@gmail.com>
Author: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CAHut+Pvp7jYcaiZ3pXedXgLcWZWDBLXFUK05JtZpGv3Mj=UOjw@mail.gmail.com

IS JSON/JSON(): Protect against expressions uncoercible to text

transformJsonParseArg() was not careful enough on generation of
transformed expressions when starting from expressions that are not
coercible to text but are in the string type category: it failed to
verify that coerce_to_target_type() succeeds, and returned a NULL
pointer. This leads to a later NULL dereference and crash at executor
time.

This escaped noticed because it cannot happen for built-in types, all of
which have casts to text. Only user-created types are potentially
problematic.

Fix by raising an error when a cast to text doesn't exist.

This mistake came in with commit 6ee30209a6f1.

Author: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reported-by: Chi Zhang <798604270@qq.com>
Reviewed-by: Srinath Reddy Sadipiralla <srinath2133@gmail.com>
Backpatch-through: 16
Discussion: https://postgr.es/m/19491-7aafc221ec63f288@postgresql.org

Fix parsing of parenthesised OLD/NEW in RETURNING list.

When parsing expressions like (old).colname and (old).* in a RETURNING
list, the parser would lose track of the intended varreturningtype,
and therefore return incorrect results.

The root cause was code using GetNSItemByRangeTablePosn() to find a
namespace item from its rtindex and levelsup, without taking into
account returningtype, which would return the wrong namespace item.
Fix by adding a new function GetNSItemByVar() that does take
returningtype into account.

Backpatch to v18, where support for RETURNING OLD/NEW was added.

Bug: #19516
Reported-by: Marko Grujic <markoog@gmail.com>
Author: Marko Grujic <markoog@gmail.com>
Suggested-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAOvwyF2cO_5mAt=w=y-dFnaG5UkZ+3H8nSDoKF_iuWZHsU2ARg@mail.gmail.com
Backpatch-through: 18

seg: Fix seg_out() to preserve the upper boundary's certainty indicator

When printing the upper boundary of a seg interval, seg_out() decided
whether to emit the certainty indicator ('<', '>' or '~') by testing the
upper indicator (u_ext) for '<' and '>', but mistakenly tested the lower
indicator (l_ext) for '~'.  This is a copy-and-paste slip from the
symmetric code that prints the lower boundary a few lines above.

The consequences for valid input were:

  * A '~' on the upper boundary was dropped on output, e.g.
    '1.5 .. ~2.5'::seg printed as '1.5 .. 2.5'.

  * When the lower boundary carried '~' but the upper boundary had no
    indicator, the wrong test matched and sprintf(p, "%c", seg->u_ext)
    wrote a NUL byte (u_ext == '\0'), which truncated the result string
    and silently lost the entire upper boundary, e.g.
    '~6.5 .. 8.5'::seg printed as '~6.5 .. '.

Certainty indicators are documented to be preserved on output (they are
ignored by the operators, but kept as comments), so this broke the
input/output round-trip for the affected values.

The bug has existed since seg was added.  It went unnoticed because the
existing regression tests only exercised certainty indicators on
single-point segs, which are printed by a different branch of seg_out().
Add tests that place indicators on both boundaries of an interval.

Author: Ewan Young <kdbase.hack@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAON2xHPYeRRCEVAv8XfE18KsEsEHCiYcJ5fOsoxFuMEfpxF1=g@mail.gmail.com
Backpatch-through: 14

Fix race with timeline selection in logical decoding during promotion

During promotion, there is a window where RecoveryInProgress() returns
true but the WAL segments of the old timeline have already been removed.
A logical decoding could pick up the old timeline in this window when
reading a page, failing with the following error:
ERROR: requested WAL segment ... has already been removed

This issue does not lead to any data correctness issue, as retrying to
decode the data works in follow-up decoding attempts.  It impacts
availability, though.  Other WAL page read callbacks have a similar
issue, this commit takes care of what should be the noisiest code path:
logical decoding with START_REPLICATION in a WAL sender.

A TAP test, based on an injection point waiting in the startup process
after the segments have been removed/recycled, is added.  This part is
backpatched down to v17.

This issue has been causing sporadic failures in the buildfarm, and
was reproducible manually.  This issue happens since logical decoding on
standbys exists, down to v16.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>
Discussion: https://postgr.es/m/7daef094-abf3-4672-bc23-3df4763b16a3@gmail.com
Backpatch-through: 16

Disallow negative values for max_retention_duration.

The subscription option max_retention_duration accepts an integer value
representing a timeout in milliseconds, where zero means unlimited
retention (no timeout). Negative values have no useful meaning, but were
silently accepted and stored in the subscription catalog.

A negative value causes should_stop_conflict_info_retention() to always
return true, because TimestampDifferenceExceeds() treats a negative
threshold as already exceeded. This stops dead tuple retention
immediately rather than honoring the configured timeout.

Fix by rejecting negative values for max_retention_duration during CREATE
SUBSCRIPTION and ALTER SUBSCRIPTION.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/9232401A-DEEE-49E1-9D11-D14A776DB82B@gmail.com

xml2: Fix crash with namespace nodes in xpath_nodeset()

pgxmlNodeSetToText() passed nodeTab[i]->doc to xmlNodeDump() without
checking the node type, which could cause a crash as a
XML_NAMESPACE_DECL maps to a xmlNs struct. The passed-in code would
then be dereferenced in xmlNodeDump().

This commit switches the code to render XML_NAMESPACE_DECL nodes with
xmlXPathCastNodeToString(), like xpath_table(). Some tests are added,
written by me.

Author: Andrey Chernyy <andrey.cherny@tantorlabs.com>
Co-authored-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/20260611031436.5afde3cb@andrnote
Backpatch-through: 14

Undo thinko in commit e78d1d6d4.

In pursuit of removing a Valgrind-detected leak, I inserted
"pfree(pq_mq_handle);" into mq_putmessage's recursion-trouble-recovery
code path, failing to notice that shm_mq_detach would have pfree'd
that block just before (i.e., this particular code path did not leak).
So now that was a double pfree. We didn't notice because the
recursion scenario isn't exercised in our regression tests, but
Alexander Lakhin found it via code fuzzing.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/b8b40954-e155-41b3-9af8-ad4f261a1b64@gmail.com

Fix MarkBufferDirtyHint() to not call GetBufferDescriptor() for local buffers

GetBufferDescriptor() was called before checking if the buffer is local.
Such buffers have a negative ID, meaning that we could call
GetBufferDescriptor() with a wrapped-around uint32 value causing a
potential out-of-bound access to the BufferDescriptors array.

This is harmless in the existing code for the current uses of
MarkBufferDirtyHint(), but the author has found a way to make that
buggy while working on a different patch set, and the order of the
operations is wrong.

Oversight in 82467f627bd4. No backpatch is required, as this is new to
v19.

Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CAExHW5uzRMYVZsXXS3HXXT0fG_sNrpUhUqwP4NorhaCqH9JDhA@mail.gmail.com

pg_buffercache: restore rowtype verification in pg_buffercache_pages()

Commit 257c8231bf9 changed pg_buffercache_pages() to materialize its output
directly into a tuplestore. As a result, the function ended up trusting
a caller-supplied RECORD descriptors. That could lead to crashes
if the supplied row definition did not match the actual returned values,
for example by passing bool Datums to tuplestore_putvalues() with
an incompatible descriptor.

Fix this by constructing the correct tuple descriptor for
pg_buffercache_pages() and assigning it to
rsinfo->setDesc after InitMaterializedSRF(). This restores the executor's
tupledesc_match() verification, so incompatible caller-supplied
row definitions are rejected with an error, as before commit 257c8231bf9.

Bug: #19508
Reported-by: Nikita Kalinin <n.kalinin@postgrespro.ru>
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Ayush Tiwari <ayushtiwari.slg01@gmail.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
Discussion: https://postgr.es/m/19508-e5f188183279219b@postgresql.org