git.ipfire.org Git - thirdparty/postgresql.git/log

Check for unterminated strings when calling uloc_getLanguage().

Missed by commit 1671f990dd66.

Author: Andreas Karlsson <andreas@proxel.se>
Discussion: https://postgr.es/m/118ca69e-47eb-42e1-83e9-72ccf40dd6fd@proxel.se
Backpatch-through: 16

Add tests for low-level PGLZ [de]compression routines

The goal of this module is to provide an entry point for the coverage of
the low-level compression and decompression PGLZ routines. The new test
is moved to a new parallel group, with all the existing
compression-related tests added to it.

This includes tests for the cases detected by fuzzing that emulate
corrupted compressed data, as fixed by 2b5ba2a0a141:
- Set control bit with read of a match tag, where no data follows.
- Set control bit with read of a match tag, where 1 byte follows.
- Set control bit with match tag where length nibble is 3 bytes
(extended case).

While on it, some tests are added for compress/decompress roundtrips,
and for check_complete=false/true. Like 2b5ba2a0a141, backpatch to all
the stable branches.

Discussion: https://postgr.es/m/adw647wuGjh1oU6p@paquier.xyz
Backpatch-through: 14

Fix overrun when comparing with unterminated ICU language string.

The overrun was introduced in commit c4ff35f10.

Author: Andreas Karlsson <andreas@proxel.se>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/96d80a47-f17f-42fa-82b1-2908efbd6541@gmail.com
Backpatch-through: 18

Fix excessive logging in idle slotsync worker.

The slotsync worker was incorrectly identifying no-op states as successful
updates, triggering a busy loop to sync slots that logged messages every
200ms. This patch corrects the logic to properly classify these states,
enabling the worker to respect normal sleep intervals when no work is
performed.

Reported-by: Fujii Masao <masao.fujii@gmail.com>
Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Backpatch-through: 17, where it was introduced
Discussion: https://postgr.es/m/CAHGQGwF6zG9Z8ws1yb3hY1VqV-WT7hR0qyXCn2HdbjvZQKufDw@mail.gmail.com

Honor passed-in database OIDs in pgstat_database.c

Three routines in pgstat_database.c incorrectly ignore the database OID
provided by their caller, using MyDatabaseId instead:
- pgstat_report_connect()
- pgstat_report_disconnect()
- pgstat_reset_database_timestamp()

The first two functions, for connection and disconnection, each have a
single caller that already passes MyDatabaseId.  This was harmless,
still incorrect.

The timestamp reset function also has a single caller, but in this case
the issue has a real impact: it fails to reset the timestamp for the
shared-database entry (datid=0) when operating on shared objects.  This
situation can occur, for example, when resetting counters for shared
relations via pg_stat_reset_single_table_counters().

There is currently one test in the tree that checks the reset of a
shared relation, for pg_shdescription, we rely on it to check what is
stored in pg_stat_database.  As stats_reset may be NULL, two resets are
done to provide a baseline for comparison.

Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Dapeng Wang <wangdp20191008@gmail.com>
Discussion: https://postgr.es/m/ABBD5026-506F-4006-A569-28F72C188693@gmail.com
Backpatch-through: 15

Fix estimate_array_length error with set-operation array coercions

When a nested set operation's output type doesn't match the parent's
expected type, recurse_set_operations builds a projection target list
using generate_setop_tlist with varno 0. If the required type
coercion involves an ArrayCoerceExpr, estimate_array_length could be
called on such a Var, and would pass it to examine_variable, which
errors in find_base_rel because varno 0 has no valid relation entry.

Fix by skipping the statistics lookup for Vars with varno 0.

Bug introduced by commit 9391f7152. Back-patch to v17, where
estimate_array_length was taught to use statistics.

Reported-by: Justin Pryzby <pryzby@telsasoft.com>
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/adjW8rfPDkplC7lF@pryzbyj2023
Backpatch-through: 17

read_stream: Remove obsolete comment.

This comment was describing the v17 implementation (or io_method=sync).

Backpatch-through: 18

Fix heap-buffer-overflow in pglz_decompress() on corrupt input.

When decoding a match tag, pglz_decompress() reads 2 bytes (or 3
for extended-length matches) from the source buffer before checking
whether enough data remains. The existing bounds check (sp > srcend)
occurs after the reads, so truncated compressed data that ends
mid-tag causes a read past the allocated buffer.

Fix by validating that sufficient source bytes are available before
reading each part of the match tag. The post-read sp > srcend
check is no longer needed and is removed.

Found by fuzz testing with libFuzzer and AddressSanitizer.

Backpatch-through: 14

Fix incremental JSON parser numeric token reassembly across chunks.

When the incremental JSON parser splits a numeric token across chunk
boundaries, it accumulates continuation characters into the partial
token buffer. The accumulator's switch statement unconditionally
accepted '+', '-', '.', 'e', and 'E' as valid numeric continuations
regardless of position, which violated JSON number grammar
(-? int [frac] [exp]). For example, input "4-" fed in single-byte
chunks would accumulate the '-' into the numeric token, producing an
invalid token that later triggered an assertion failure during
re-lexing.

Fix by tracking parser state (seen_dot, seen_exp, prev character)
across the existing partial token and incoming bytes, so that each
character class is accepted only in its grammatically valid position.

Backpatch-through: 17

Zero-fill private_data when attaching an injection point

InjectionPointAttach() did not initialize the private_data buffer of the
shared memory entry before (perhaps partially) overwriting it. When the
private data is set to NULL by the caler, the buffer was left
uninitialized. If set, it could have stale contents.

The buffer is initialized to zero, so as the contents recorded when a
point is attached are deterministic.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tsGHu2h6YLnVu4HiK05q+gTE_9WVUAqihW2LSscAYS-g@mail.gmail.com
Backpatch-through: 17

Fix integer overflow in nodeWindowAgg.c

In nodeWindowAgg.c, the calculations for frame start and end positions
in ROWS and GROUPS modes were performed using simple integer addition.
If a user-supplied offset was sufficiently large (close to INT64_MAX),
adding it to the current row or group index could cause a signed
integer overflow, wrapping the result to a negative number.

This led to incorrect behavior where frame boundaries that should have
extended indefinitely (or beyond the partition end) were treated as
falling at the first row, or where valid rows were incorrectly marked
as out-of-frame.  Depending on the specific query and data, these
overflows can result in incorrect query results, execution errors, or
assertion failures.

To fix, use overflow-aware integer addition (ie, pg_add_s64_overflow)
to check for overflows during these additions.  If an overflow is
detected, the boundary is now clamped to INT64_MAX.  This ensures the
logic correctly treats the boundary as extending to the end of the
partition.

Bug: #19405
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/19405-1ecf025dda171555@postgresql.org
Backpatch-through: 14

Strip PlaceHolderVars from partition pruning operands

When pulling up a subquery, its targetlist items may be wrapped in
PlaceHolderVars to enforce separate identity or as a result of outer
joins.  This causes any upper-level WHERE clauses referencing these
outputs to contain PlaceHolderVars, which prevents partprune.c from
recognizing that they match partition key columns, defeating partition
pruning.

To fix, strip PlaceHolderVars from operands before comparing them to
partition keys.  A PlaceHolderVar with empty phnullingrels appearing
in a relation-scan-level expression is effectively a no-op, so
stripping it is safe.  This parallels the existing treatment in
indxpath.c for index matching.

In passing, rename strip_phvs_in_index_operand() to strip_noop_phvs()
and move it from indxpath.c to placeholder.c, since it is now a
general-purpose utility used by both index matching and partition
pruning code.

Back-patch to v18.  Although this issue exists before that, changes in
that version made it common enough to notice.  Given the lack of field
reports for older versions, I am not back-patching further.  In the
v18 back-patch, strip_phvs_in_index_operand() is retained as a thin
wrapper around the new strip_noop_phvs() to avoid breaking third-party
extensions that may reference it.

Reported-by: Cándido Antonio Martínez Descalzo <candido@ninehq.com>
Diagnosed-by: David Rowley <dgrowleyml@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CAH5YaUwVUWETTyVECTnhs7C=CVwi+uMSQH=cOkwAUqMdvXdwWA@mail.gmail.com
Backpatch-through: 18

Fix ABI break by moving PROCSIG_SLOTSYNC_MESSAGE in ProcSignalReason

Commit 58c1188a3ea added PROCSIG_SLOTSYNC_MESSAGE in the middle of
enum ProcSignalReason, breaking the ABI.

Fix this by moving PROCSIG_SLOTSYNC_MESSAGE to the end of the enum,
to restore ordering.

Per buildfarm member crake.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwH_AAbtsiYDJt65N7_4PJ0CgOJmBEaCq68e5_tcuG_vXw@mail.gmail.com
Backpatch-through: 18 only

Fix slotsync worker blocking promotion when stuck in wait

Previously, on standby promotion, the startup process sent SIGUSR1 to
the slotsync worker (or a backend performing slot synchronization) and
waited for it to exit. This worked in most cases, but if the process was
blocked waiting for a response from the primary (e.g., due to a network
failure), SIGUSR1 would not interrupt the wait. As a result, the process
could remain stuck, causing the startup process to wait for a long time
and delaying promotion.

This commit fixes the issue by introducing a new procsignal reason,
PROCSIG_SLOTSYNC_MESSAGE. On promotion, the startup process
sends this signal, and the handler sets interrupt flags so the process
exits (or errors out) promptly at CHECK_FOR_INTERRUPTS(), allowing
promotion to complete without delay.

Backpatch to v17, where slotsync was introduced.

Author: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwFzNYroAxSoyJhqTU-pH=t4Ej6RyvhVmBZ91Exj_TPMMQ@mail.gmail.com
Backpatch-through: 17

Enhance slot synchronization API to respect promotion signal.

Previously, during a promotion, only the slot synchronization worker was
signaled to shut down. The backend executing slot synchronization via the
pg_sync_replication_slots() SQL function was not signaled, allowing it to
complete its synchronization cycle before exiting.

An upcoming patch improves pg_sync_replication_slots() to wait until
replication slots are fully persisted before finishing. This behaviour
requires the backend to exit promptly if a promotion occurs.

This patch ensures that, during promotion, a signal is also sent to the
backend running pg_sync_replication_slots(), allowing it to be interrupted
and exit immediately.

This change was originally committed to master only. However, backpatch
it to v17, where slot synchronization was introduced. Because it is required
for an upcoming bug fix addressing slotsync (including
pg_sync_replication_slots()) blocking promotion when stuck in a wait.

Author: Ajin Cherian <itsajin@gmail.com>
Reviewed-by: Shveta Malik <shveta.malik@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/CAFPTHDZAA%2BgWDntpa5ucqKKba41%3DtXmoXqN3q4rpjO9cdxgQrw%40mail.gmail.com
Backpatch-through: 17

Fix WITHOUT OVERLAPS' interaction with domains.

UNIQUE/PRIMARY KEY ... WITHOUT OVERLAPS requires the no-overlap
column to be a range or multirange, but it should allow a domain
over such a type too. This requires minor adjustments in both
the parser and executor.

In passing, fix a nearby break-instead-of-continue thinko in
transformIndexConstraint. This had the effect of disabling
parse-time validation of the no-overlap column's type in the context
of ALTER TABLE ADD CONSTRAINT, if it follows a dropped column.
We'd still complain appropriately at runtime though.

Author: Jian He <jian.universality@gmail.com>
Reviewed-by: Paul A Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxGoAmN_0iJ=hjTG0vGpOSOyy-vYyfE+-q0AWxrq2_p5XQ@mail.gmail.com
Backpatch-through: 18

Fix shmem allocation of fixed-sized custom stats kind

StatsShmemSize(), that computes the shmem size needed for pgstats,
includes the amount of shared memory wanted by all the custom stats
kinds registered. However, the shared memory allocation was done by
ShmemAlloc() in StatsShmemInit(), meaning that the space reserved was
not used, wasting some memory.

These extra allocations would show up under "<anonymous>" in
pg_shmem_allocations, as the allocations done by ShmemAlloc() are not
tracked by ShmemIndexEnt.

Issue introduced by 7949d9594582.

Author: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/04b04387-92f5-476c-90b0-4064e71c5f37@iki.fi
Backpatch-through: 18

Fix shared memory size of template code for custom fixed-sized pgstats

On HEAD, the template code for custom fixed-sized pgstats is in the test
module test_custom_stats.  On REL_18_STABLE, this code lives in the test
module injection_points.

Both cases were underestimating the size of the shared memory area
required for the storage of the stats data, using a single entry rather
than the whole area.  This underestimation meant that there was no
memory allocated for the LWLock required for the stats, and even more.
This problem would be also misleading for extension developers looking
at this code.

This issue has been noticed while digging into a different bug reported
by Heikki Linnakangas, showing that the underestimation was causing
failures in the TAP tests of the test modules for 32-bit builds.  The
other issue reported, related to the memory allocation of custom
fixed-sized pgstats, will be fixed in a follow-up commit.

Discussion: https://postgr.es/m/adMk_lWbnz3HDOA8@paquier.xyz
Backpatch-through: 18

Avoid unsafe access to negative index in a TupleDesc.

Commit aa606b931 installed a test that would reference a nonexistent
TupleDesc array entry if a system column is used in COPY FROM WHERE.
Typically this would be harmless, but with bad luck it could result
in a phony "generated columns are not supported in COPY FROM WHERE
conditions" error, and at least in principle it could cause SIGSEGV.
(Compare 570e2fcc0 which fixed the identical problem in another
place.) Also, since c98ad086a it throws an Assert instead.

In the back branches, just guard the test to make it a safe no-op for
system columns. Commit 21c69dc73 installed a more aggressive answer
in master.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/6f435023-8ab6-47c2-ba07-035d0c4212f9@gmail.com
Backpatch-through: 14-18

Fix null-bitmap combining in array_agg_array_combine().

This code missed the need to update the combined state's
nullbitmap if state1 already had a bitmap but state2 didn't.
We need to extend the existing bitmap with 1's but didn't.
This could result in wrong output from a parallelized
array_agg(anyarray) calculation, if the input has a mix of
null and non-null elements. The errors depended on timing
of the parallel workers, and therefore would vary from one
run to another.

Also install guards against integer overflow when calculating
the combined object's sizes, and make some trivial cosmetic
improvements.

Author: Dmytro Astapov <dastapov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFQUnFj2pQ1HbGp69+w2fKqARSfGhAi9UOb+JjyExp7kx3gsqA@mail.gmail.com
Backpatch-through: 16

More tar portability adjustments.

For the three implementations that have caused problems so far:

* GNU and BSD (libarchive) tar both understand --format=ustar
* ustar doesn't support large UID/GID values, so set them to 0 to
  avoid a hard error from at least GNU tar
* OpenBSD tar needs -F ustar, and it appears to warn but carry
  on with "nobody" if a UID is too large
* -f /dev/null is a more portable way to throw away the output, since
  the default destination might be a tape device depending on build
  options that a distribution might change
* Windows ships BSD tar but lacks /dev/null, so ask perl for its name

Based on their manuals, the other two implementations the tests are
likely to encounter in the wild don't seem to need any special handling:

* Solaris/illumos tar uses ustar and replaces large UIDs with 60001
* AIX tar uses ustar (unless --format=pax) and truncates large UIDs

Backpatch-through: 18
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Sami Imseih <samimseih@gmail.com> (large UIDs)
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version)
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com> (OpenBSD)
Reviewed-by: Andrew Dunstan <andrew@dunslane.net> (Windows)
Discussion: https://postgr.es/m/3676229.1775170250%40sss.pgh.pa.us
Discussion: https://postgr.es/m/CAA5RZ0tt89MgNi4-0F4onH%2B-TFSsysFjMM-tBc6aXbuQv5xBXw%40mail.gmail.com

jit: No backport::SectionMemoryManager for LLVM 22.

LLVM 22 has the fix that we copied into our tree in commit 9044fc1d and
a new function to reach it[1][2], so we only need to use our copy for
Aarch64 + LLVM < 22.  The only change to the final version that our copy
didn't get is a new LLVM_ABI macro, but that isn't appropriate for us.
Our copy is hopefully now frozen and would only need maintenance if bugs
are found in the upstream code.

Non-Aarch64 systems now also use the new API with LLVM 22.  It allocates
all sections with one contiguous mmap() instead of one per
section.  We could have done that earlier, but commit 9044fc1d wanted to
limit the blast radius to the affected systems.  We might as well
benefit from that small improvement everywhere now that it is available
out of the box.

We can't delete our copy until LLVM 22 is our minimum supported version,
or we switch to the newer JITLink API for at least Aarch64.

[1] https://github.com/llvm/llvm-project/pull/71968
[2] https://github.com/llvm/llvm-project/pull/174307

Backpatch-through: 14
Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com

Further harden tests that might use not-so-compatible tar versions.

Buildfarm testing shows that OpenSUSE (and perhaps related platforms?)
configures GNU tar in such a way that it'll archive sparse WAL files
by default, thus triggering the pax-extension detection code added by
bc30c704a.  Thus, we need something similar to 852de579a but for
GNU tar's option set.  "--format=ustar" seems to do the trick.

Moreover, the buildfarm shows that pg_verifybackup's 003_corruption.pl
test script is also triggering creation of pax-format tar files on
that platform.  We had not noticed because those test cases all fail
(intentionally) before getting to the point of trying to verify WAL
data.

Since that means two TAP scripts need this option-selection logic, and
plausibly more will do so in future, factor it out into a subroutine
in Test::Utils.  We also need to back-patch the 003_corruption.pl fix
into v18, where it's also failing.

While at it, clean up some places where guards for $tar being empty
or undefined were incomplete or even outright backwards.  Presumably,
we missed noticing because the set of machines that run TAP tests
and don't have tar installed is empty.  But if we're going to try
to handle that scenario, we should do it correctly.

Reported-by: Tomas Vondra <tomas@vondra.me>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/02770bea-b3f3-4015-8a43-443ae345379c@vondra.me
Backpatch-through: 18

Harden astreamer tar parsing logic against archives it can't handle.

Previously, there was essentially no verification in this code that
the input is a tar file at all, let alone that it fits into the
subset of valid tar files that we can handle.  This was exposed by
the discovery that we couldn't handle files that FreeBSD's tar
makes, because it's fairly aggressive about converting sparse WAL
files into sparse tar entries.  To fix:

* Bail out if we find a pax extension header.  This covers the
sparse-file case, and also protects us against scenarios where
the pax header changes other file properties that we care about.
(Eventually we may extend the logic to actually handle such
headers, but that won't happen in time for v19.)

* Be more wary about tar file type codes in general: do not assume
that anything that's neither a directory nor a symlink must be a
regular file.  Instead, we just ignore entries that are none of the
three supported types.

* Apply pg_dump's isValidTarHeader to verify that a purported
header block is actually in tar format.  To make this possible,
move isValidTarHeader into src/port/tar.c, which is probably where
it should have been since that file was created.

I also took the opportunity to const-ify the arguments of
isValidTarHeader and tarChecksum, and to use symbols not hard-wired
constants inside tarChecksum.

Back-patch to v18 but not further.  Although this code exists inside
pg_basebackup in older branches, it's not really exposed in that
usage to tar files that weren't generated by our own code, so it
doesn't seem worth back-porting these changes across 3c9056981
and f80b09bac.  I did choose to include a back-patch of 5868372bb
into v18 though, to minimize cosmetic differences between these
two branches.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/3049460.1775067940@sss.pgh.pa.us>
Backpatch-through: 18

jit: Stop emitting lifetime.end for LLVM 22.

The lifetime.end intrinsic can now only be used for stack memory
allocated with alloca[1][2][3]. We use it to tell LLVM about the
lifetime of function arguments/isnull values that we keep in palloc'd
memory, so that it can avoid spilling registers to memory.

We might need to rearrange things and put them on the stack, but that'll
take some research. In the meantime, unbreak the build on LLVM 22.

[1] https://github.com/llvm/llvm-project/pull/149310
[2] https://llvm.org/docs/LangRef.html#llvm-lifetime-end-intrinsic
[3] https://llvm.org/docs/LangRef.html#i-alloca

Backpatch-through: 14
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com> (earlier attempt)
Reviewed-by: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> (earlier attempt)
Reviewed-by: Andres Freund <andres@anarazel.de> (earlier attempt)
Discussion: https://postgr.es/m/CA%2BhUKGJTumad75o8Zao-LFseEbt%3DenbUFCM7LZVV%3Dc8yg2i7dg%40mail.gmail.com

doc: Add missing description for DROP SUBSCRIPTION IF EXISTS.

Oversight in commit 665d1fad99.

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/CAHut%2BPv72haFerrCdYdmF6hu6o2jKcGzkXehom%2BsP-JBBmOVDg%40mail.gmail.com
Backpatch-through: 14

Be more careful to preserve consistency of a tuplestore.

Several places in tuplestore.c would leave the tuplestore data
structure effectively corrupt if some subroutine were to throw
an error.  Notably, if WRITETUP() failed after some number of
successful calls within dumptuples(), the tuplestore would
contain some memtuples pointers that were apparently live
entries but in fact pointed to pfree'd chunks.

In most cases this sort of thing is fine because transaction
abort cleanup is not too picky about the contents of memory that
it's going to throw away anyway.  There's at least one exception
though: if a Portal has a holdStore, we're going to call
tuplestore_end() on that, even during transaction abort.
So it's not cool if that tuplestore is corrupt, and that means
tuplestore.c has to be more careful.

This oversight demonstrably leads to crashes in v15 and before,
if a holdable cursor fails to persist its data due to an undersized
temp_file_limit setting.  Very possibly the same thing can happen in
v16 and v17 as well, though the specific test case submitted failed
to fail there (cf. 095555daf).  The failure is accidentally dodged
as of v18 because 590b045c3 got rid of tuplestore_end's retail tuple
deletion loop.  Still, it seems unwise to permit tuplestores to become
internally inconsistent in any branch, so I've applied the same fix
across the board.

Since the known test case for this is rather expensive and doesn't
fail in recent branches, I've omitted it.

Bug: #19438
Reported-by: Dmitriy Kuzmin <kuzmin.db4@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/19438-9d37b179c56d43aa@postgresql.org
Backpatch-through: 14

Detect pfree or repalloc of a previously-freed memory chunk.

Before the major rewrite in commit c6e0fe1f2, AllocSetFree() would
typically crash when asked to free an already-free chunk.  That was
an ugly but serviceable way of detecting coding errors that led to
double pfrees.  But since that rewrite, double pfrees went through
just fine, because the "hdrmask" of a freed chunk isn't changed at all
when putting it on the freelist.  We'd end with a corrupt freelist
that circularly links back to the doubly-freed chunk, which would
usually result in trouble later, far removed from the actual bug.

This situation is no good at all for debugging purposes.  Fortunately,
we can fix it at low cost in MEMORY_CONTEXT_CHECKING builds by making
AllocSetFree() check for chunk->requested_size == InvalidAllocSize,
relying on the pre-existing code that sets it that way just below.

I investigated the alternative of changing a freed chunk's methodid
field, which would allow detection in non-MEMORY_CONTEXT_CHECKING
builds too.  But that adds measurable overhead.  Seeing that we didn't
notice this oversight for more than three years, it's hard to argue
that detecting this type of bug is worth any extra overhead in
production builds.

Likewise fix AllocSetRealloc() to detect repalloc() on a freed chunk,
and apply similar changes in generation.c and slab.c.  (generation.c
would hit an Assert failure anyway, but it seems best to make it act
like aset.c.)  bump.c doesn't need changes since it doesn't support
pfree in the first place.  Ideally alignedalloc.c would receive
similar changes, but in debugging builds it's impossible to reach
AlignedAllocFree() or AlignedAllocRealloc() on a pfreed chunk, because
the underlying context's pfree would have wiped the chunk header of
the aligned chunk.  But that means we should get an error of some
sort, so let's be content with that.

Per investigation of why the test case for bug #19438 didn't appear to
fail in v16 and up, even though the underlying bug was still present.
(This doesn't fix the underlying double-free bug, just cause it to
get detected.)

Bug: #19438
Reported-by: Dmitriy Kuzmin <kuzmin.db4@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/19438-9d37b179c56d43aa@postgresql.org
Backpatch-through: 16

Fix FK triggers losing DEFERRABLE/INITIALLY DEFERRED when marked ENFORCED again

Previously, a foreign key defined as DEFERRABLE INITIALLY DEFERRED could
behave as NOT DEFERRABLE after being set to NOT ENFORCED and then back
to ENFORCED.

This happened because recreating the FK triggers on re-enabling the constraint
forgot to restore the tgdeferrable and tginitdeferred fields in pg_trigger.

Fix this bug by properly setting those fields when the foreign key constraint
is marked ENFORCED again and its triggers are recreated, so the original
DEFERRABLE and INITIALLY DEFERRED properties are preserved.

Backpatch to v18, where NOT ENFORCED foreign keys were introduced.

Author: Yasuo Honda <yasuo.honda@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAKmOUTms2nkxEZDdcrsjq5P3b2L_PR266Hv8kW5pANwmVaRJJQ@mail.gmail.com
Backpatch-through: 18

Fix datum_image_*()'s inability to detect sign-extension variations

Functions such as hash_numeric() are not careful to use the correct
PG_RETURN_*() macro according to the return type of that function as
defined in pg_proc.  Because that function is meant to return int32,
when the hashed value exceeds 2^31, the 64-bit Datum value won't wrap to
a negative number, which means the Datum won't have the same value as it
would have had it been cast to int32 on a two's complement machine.  This
isn't harmless as both datum_image_eq() and datum_image_hash() may receive
a Datum that's been formed and deformed from a tuple in some cases, and
not in other cases.  When formed into a tuple, the Datum value will be
coerced into an integer according to the attlen as specified by the
TupleDesc.  This can result in two Datums that should be equal being
classed as not equal, which could result in (but not limited to) an error
such as:

ERROR:  could not find memoization table entry

Here we fix this by ensuring we cast the Datum value to a signed integer
according to the typLen specified in the datum_image_eq/datum_image_hash
function call before comparing or hashing.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: Tender Wang <tndrwang@gmail.com>
Backpatch-through: 14
Discussion: https://postgr.es/m/CAHewXNmcXVFdB9_WwA8Ez0P+m_TQy_KzYk5Ri5dvg+fuwjD_yw@mail.gmail.com

Doc: fix stale text about partition locking with cached plans

Commit 121d774caea added text to master describing pruning-aware
locking behavior introduced by 525392d57. That behavior was
reverted in May 2025, making the text incorrect. Replace it with
the text used in back branches, which correctly describes current
behavior: pruned partitions are still locked at the beginning of
execution.

Discussion: https://postgr.es/m/CA+HiwqFT0fPPoYBr0iUFWNB-Og7bEXB9hB=6ogk_qD9=OM8Vbw@mail.gmail.com

Fix multiple bugs in astreamer pipeline code.

astreamer_tar_parser_content() sent the wrong data pointer when
forwarding MEMBER_TRAILER padding to the next streamer.  After
astreamer_buffer_until() buffers the padding bytes, the 'data'
pointer has been advanced past them, but the code passed 'data'
instead of bbs_buffer.data.  This caused the downstream consumer
to receive bytes from after the padding rather than the padding
itself, and could read past the end of the input buffer.

astreamer_gzip_decompressor_content() only checked for
Z_STREAM_ERROR from inflate(), silently ignoring Z_DATA_ERROR
(corrupted data) and Z_MEM_ERROR (out of memory).  Fix by
treating any return other than Z_OK, Z_STREAM_END, and
Z_BUF_ERROR as fatal.

astreamer_gzip_decompressor_free() missed calling inflateEnd() to
release zlib's internal decompression state.

astreamer_tar_parser_free() neglected to pfree() the streamer
struct itself, leaking it.

astreamer_extractor_content() did not check the return value of
fclose() when closing an extracted file.  A deferred write error
(e.g., disk full on buffered I/O) would be silently lost.

Discussion: https://postgr.es/m/results/98c6b630-acbb-44a7-97fa-1692ce2b827c@dunslane.net

Reviewed-By: Tom Lane <tgl@sss.pgh.pa.us>
Backpatch-through: 15

Avoid memory leak on error while parsing pg_stat_statements dump file

By using palloc() instead of raw malloc().

Reported-by: Gaurav Singh <gaurav.singh@yugabyte.com>
Reviewed-by: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://www.postgresql.org/message-id/CAEcQ1bYR9s4eQLFDjzzJHU8fj-MTbmRpW-9J-r2gsCn+HEsynw@mail.gmail.com
Backpatch-through: 14

Fix off-by-one error in read IO tracing

AsyncReadBuffer()'s no-IO needed path passed
TRACE_POSTGRESQL_BUFFER_READ_DONE the wrong block number because it had
already incremented operation->nblocks_done. Fix by folding the
nblocks_done offset into the blocknum local variable at initialization.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/u73un3xeljr4fiidzwi4ikcr6vm7oqugn4fo5vqpstjio6anl2%40hph6fvdiiria
Backpatch-through: 18

Fix premature NULL lag reporting in pg_stat_replication

pg_stat_replication is documented to keep the last measured lag values for
a short time after the standby catches up, and then set them to NULL when
there is no WAL activity. However, previously lag values could become NULL
prematurely even while WAL activity was ongoing, especially in logical
replication.

This happened because the code cleared lag when two consecutive reply messages
indicated that the apply location had caught up with the send location.
It did not verify that the reported positions were unchanged, so lag could be
cleared even when positions had advanced between messages. In logical
replication, where the apply location often quickly catches up, this issue was
more likely to occur.

This commit fixes the issue by clearing lag only when the standby reports that
it has fully replayed WAL (i.e., both flush and apply locations have caught up
with the send location) and the write/flush/apply positions remain unchanged
across two consecutive reply messages.

The second message with unchanged positions typically results from
wal_receiver_status_interval, so lag values are cleared after that interval
when there is no activity. This avoids showing stale lag data while preventing
premature NULL values.

Even with this fix, lag may rarely become NULL during activity if identical
position reports are sent repeatedly. Eliminating such duplicate messages
would address this fully, but that change is considered too invasive for stable
branches and will be handled in master only later.

Backpatch to all supported branches.

Author: Shinya Kato <shinya11.kato@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAOzEurTzcUrEzrH97DD7+Yz=HGPU81kzWQonKZvqBwYhx2G9_A@mail.gmail.com
Backpatch-through: 14

Prevent spurious "indexes on virtual generated columns are not supported".

Both of the checks in DefineIndex() that can produce this error
message have a guard against negative attribute numbers, but lack a
guard to ensure that attno is non-zero. As a result, we can index
off the beginning of the TupleDesc and read a garbage byte for
attgenerated. If that byte happens to be 'v', we'll incorrectly
produce the error mentioned above.

The first call site is easy to hit: any attempt to create an
expression index does so. The second one is not currently hit in
the regression tests, but can be hit by something like
CREATE INDEX ON some_table ((some_function(some_table))).

Found by study of a test_plan_advice failure on buildfarm member
skink, though this issue has nothing to do with test_plan_advice
and seems to have only been revealed by happenstance.

Backpatch-through: 18
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoacixUZVvi00hOjk_d9B4iYKswWP1gNqQ8Vfray-AcOCA@mail.gmail.com

Fix copy-paste error in test_ginpostinglist

The check for a mismatch on the second decoded item pointer
was an exact copy of the first item pointer check, comparing
orig_itemptrs[0] with decoded_itemptrs[0] instead of orig_itemptrs[1]
with decoded_itemptrs[1]. The error message also reported (0, 1) as
the expected value instead of (blk, off). As a result, any decoding
error in the second item pointer (where the varbyte delta encoding
is exercised) would go undetected.

This has been wrong since commit bde7493d1, so backpatch to all
supported versions.

Author: Jianghua Yang <yjhjstz@gmail.com>
Discussion: https://postgr.es/m/CAAZLFmSOD8R7tZjRLZsmpKtJLoqjgawAaM-Pne1j8B_Q2aQK8w@mail.gmail.com
Backpatch-through: 14

Further improve commentary about ChangeVarNodesWalkExpression()

The updated comment explains why we use ChangeVarNodes_walker() instead of
expression_tree_walker(), and provides a bit more detail about the differences
in processing top-level Query and subqueries.

Author: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAPpHfdvbjq342WTQ705Wmqhe8794pcp7wospz%2BWUJ2qB7vuOqA%40mail.gmail.com
Backpatch-through: 18

Improve commentary about ChangeVarNodesWalkExpression().

IMO the proximate cause of the bug fixed in commit 07b7a964d
was sloppy thinking about what ChangeVarNodesWalkExpression()
is to be used for. Flesh out its header comment to try to
improve that situation.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1607553.1774017006@sss.pgh.pa.us
Backpatch-through: 18

Fix multixact backwards-compatibility with CHECKPOINT race condition

If a CHECKPOINT record with nextMulti N is written to the WAL before
the CREATE_ID record for N, and N happens to be the first multixid on
an offset page, the backwards compatibility logic to tolerate WAL
generated by older minor versions (before commit 789d65364c) failed to
compensate for the missing XLOG_MULTIXACT_ZERO_OFF_PAGE record. In
that case, the latest_page_number was initialized at the start of WAL
replay to the page for nextMulti from the CHECKPOINT record, even if
we had not seen the CREATE_ID record for that multixid yet, which
fooled the backwards compatibility logic to think that the page was
already initialized.

To fix, track the last XLOG_MULTIXACT_ZERO_OFF_PAGE that we've seen
separately from latest_page_number. If we haven't seen any
XLOG_MULTIXACT_ZERO_OFF_PAGE records yet, use
SimpleLruDoesPhysicalPageExist() to check if the page needs to be
initialized.

Reported-by: duankunren.dkr <duankunren.dkr@alibaba-inc.com>
Analyzed-by: duankunren.dkr <duankunren.dkr@alibaba-inc.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/c4ef1737-8cba-458e-b6fd-4e2d6011e985.duankunren.dkr@alibaba-inc.com
Backpatch-through: 14-18

Fix invalid value of pg_aios.pid, function pg_get_aios()

When the value of pg_aios.pid is found to be 0, the function had the
idea to set "nulls" to "false" instead of "true", without setting the
value stored in the tuplestore. This could lead to the display of buggy
data. The intention of the code is clearly to display NULL when a PID
of 0 is found, and this commit adjusts the logic to do so.

Issue introduced by 60f566b4f243.

Author: ChangAo Chen <cca5507@qq.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/tencent_7D61A85D6143AD57CA8D8C00DEC541869D06@qq.com
Backpatch-through: 18

Fix finalization of decompressor astreamers.

Send the correct amount of data to the next astreamer, not the
whole allocated buffer size. This bug escaped detection because
in present uses the next astreamer is always a tar-file parser
which is insensitive to trailing garbage. But that may not
be true in future uses.

Author: Andrew Dunstan <andrew@dunslane.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2178517.1774064942@sss.pgh.pa.us
Backpatch-through: 15

Fix self-join removal to update bare Var references in join clauses

Self-join removal failed to update Var nodes when the join clause was a
bare Var (e.g., ON t1.bool_col) rather than an expression containing
Vars. ChangeVarNodesWalkExpression() used expression_tree_walker(),
which descends into child nodes but does not process the top-level node
itself. When a bare Var referencing the removed relation appeared as
the clause, its varno was left unchanged, leading to "no relation entry
for relid N" errors.

Fix by calling ChangeVarNodes_walker() directly instead of
expression_tree_walker(), so the top-level node is also processed.

Bug: #19435
Reported-by: Hang Ammmkilo <ammmkilo@163.com>
Author: Andrei Lepikhov <lepihov@gmail.com>
Co-authored-by: Tender Wang <tndrwang@gmail.com>
Co-authored-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/19435-3cc1a87f291129f1%40postgresql.org
Backpatch-through: 18

SET NOT NULL: Call object-alter hook only after the catalog change

... otherwise, the function invoked by the hook might consult the
catalog and not see that the new constraint exists.

This relies on set_attnotnull doing CommandCounterIncrement()
after successfully modifying the catalog.

Oversight in commit 14e87ffa5c54.

Author: Artur Zakirov <zaartur@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CAKNkYnxUPCJk-3Xe0A3rmCC8B8V8kqVJbYMVN6ySGpjs_qd7dQ@mail.gmail.com

Fix dependency on FDW handler.

ALTER FOREIGN DATA WRAPPER could drop the dependency on the handler
function if it wasn't explicitly specified.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/35c44a4b7fb76d35418c4d66b775a88f4ce60c86.camel@j-davis.com
Backpatch-through: 14

Fix WAL flush LSN used by logical walsender during shutdown

Commit 6eedb2a5fd8 made the logical walsender call
XLogFlush(GetXLogInsertRecPtr()) to ensure that all pending WAL is flushed,
fixing a publisher shutdown hang. However, if the last WAL record ends at
a page boundary, GetXLogInsertRecPtr() can return an LSN pointing past
the page header, which can cause XLogFlush() to report an error.

A similar issue previously existed in the GiST code. Commit b1f14c96720
introduced GetXLogInsertEndRecPtr(), which returns a safe WAL insertion end
location (returning the start of the page when the last record ends at a page
boundary), and updated the GiST code to use it with XLogFlush().

This commit fixes the issue by making the logical walsender use
XLogFlush(GetXLogInsertEndRecPtr()) when flushing pending WAL during shutdown.

Backpatch to all supported versions.

Reported-by: Andres Freund <andres@anarazel.de>
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/vzguaguldbcyfbyuq76qj7hx5qdr5kmh67gqkncyb2yhsygrdt@dfhcpteqifux
Backpatch-through: 14

Tighten asserts on ParallelWorkerNumber

The comment about ParallelWorkerNumbr in parallel.c says:

  In parallel workers, it will be set to a value >= 0 and < the number
  of workers before any user code is invoked; each parallel worker will
  get a different parallel worker number.

However asserts in various places collecting instrumentation allowed
(ParallelWorkerNumber == num_workers). That would be a bug, as the value
is used as index into an array with num_workers entries.

Fixed by adjusting the asserts accordingly. Backpatch to all supported
versions.

Discussion: https://postgr.es/m/5db067a1-2cdf-4afb-a577-a04f30b69167@vondra.me
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Backpatch-through: 14

Use GetXLogInsertEndRecPtr in gistGetFakeLSN

The function used GetXLogInsertRecPtr() to generate the fake LSN. Most
of the time this is the same as what XLogInsert() would return, and so
it works fine with the XLogFlush() call. But if the last record ends at
a page boundary, GetXLogInsertRecPtr() returns LSN pointing after the
page header. In such case XLogFlush() fails with errors like this:

ERROR: xlog flush request 0/01BD2018 is not satisfied --- flushed only to 0/01BD2000

Such failures are very hard to trigger, particularly outside aggressive
test scenarios.

Fixed by introducing GetXLogInsertEndRecPtr(), returning the correct LSN
without skipping the header. This is the same as GetXLogInsertRecPtr(),
except that it calls XLogBytePosToEndRecPtr().

Initial investigation by me, root cause identified by Andres Freund.

This is a long-standing bug in gistGetFakeLSN(), probably introduced by
c6b92041d38 in PG13. Backpatch to all supported versions.

Reported-by: Peter Geoghegan <pg@bowt.ie>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/vf4hbwrotvhbgcnknrqmfbqlu75oyjkmausvy66ic7x7vuhafx@e4rvwavtjswo
Backpatch-through: 14

xml2: Fix failure with xslt_process() under -fsanitize=undefined

The logic of xslt_process() has never considered the fact that
xsltSaveResultToString() would return NULL for an empty string (the
upstream code has always done so, with a string length of 0). This
would cause memcpy() to be called with a NULL pointer, something
forbidden by POSIX.

Like 46ab07ffda9d and similar fixes, this is backpatched down to all the
supported branches, with a test case to cover this scenario. An empty
string has been always returned in xml2 in this case, based on the
history of the module, so this is an old issue.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/c516a0d9-4406-47e3-9087-5ca5176ebcf9@gmail.com
Backpatch-through: 14

Prevent restore of incremental backup from bloating VM fork.

When I (rhaas) wrote the WAL summarizer code, I incorrectly believed
that XLOG_SMGR_TRUNCATE truncates all forks to the same length. In
fact, what other parts of the code do is compute the truncation length
for the FSM and VM forks from the truncation length used for the main
fork. But, because I was confused, I coded the WAL summarizer to set the
limit block for the VM fork to the same value as for the main fork.
(Incremental backup always copies FSM forks in full, so there is no
similar issue in that case.)

Doing that doesn't directly cause any data corruption, as far as I can
see. However, it does create a serious risk of consuming a large amount
of extra disk space, because pg_combinebackup's reconstruct.c believes
that the reconstructed file should always be at least as long as the
limit block value. We might want to be smarter about that at some point
in the future, because it's always safe to omit all-zeroes blocks at the
end of the last segment of a relation, and doing so could save disk
space, but the current algorithm will rarely waste enough disk space to
worry about unless we believe that a relation has been truncated to a
length much longer than its actual length on disk, which is exactly what
happens as a result of the problem mentioned in the previous paragraph.

To fix, create a new visibilitymap helper function and use it to include
the right limit block in the summary files. Incremental backups taken
with existing summary files will still have this issue, but this should
improve the situation going forward.

Diagnosed-by: Oleg Tkachenko <oatkachenko@gmail.com>
Diagnosed-by: Amul Sul <sulamul@gmail.com>
Discussion: http://postgr.es/m/CAAJ_b97PqG89hvPNJ8cGwmk94gJ9KOf_pLsowUyQGZgJY32o9g@mail.gmail.com
Discussion: http://postgr.es/m/6897DAF7-B699-41BF-A6FB-B818FCFFD585%40gmail.com
Backpatch-through: 17

doc: Document IF NOT EXISTS option for ALTER FOREIGN TABLE ADD COLUMN.

Commit 2cd40adb85d added the IF NOT EXISTS option to ALTER TABLE ADD COLUMN.
This also enabled IF NOT EXISTS for ALTER FOREIGN TABLE ADD COLUMN,
but the ALTER FOREIGN TABLE documentation was not updated to mention it.

This commit updates the documentation to describe the IF NOT EXISTS option for
ALTER FOREIGN TABLE ADD COLUMN.

While updating that section, also this commit clarifies that the COLUMN keyword
is optional in ALTER FOREIGN TABLE ADD/DROP COLUMN. Previously, part of
the documentation could be read as if COLUMN were required.

This commit adds regression tests covering these ALTER FOREIGN TABLE syntaxes.

Backpatch to all supported versions.

Suggested-by: Fujii Masao <masao.fujii@gmail.com>
Author: Chao Li <lic@highgo.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwFk=rrhrwGwPtQxBesbT4DzSZ86Q3ftcwCu3AR5bOiXLw@mail.gmail.com
Backpatch-through: 14

Fix size underestimation of DSA pagemap for odd-sized segments

When make_new_segment() creates an odd-sized segment, the pagemap was
only sized based on a number of usable_pages entries, forgetting that a
segment also contains metadata pages, and that the FreePageManager uses
absolute page indices that cover the entire segment.  This
miscalculation could cause accesses to pagemap entries to be out of
bounds.  During subsequent reuse of the allocated segment, allocations
landing on pages with indices higher than usable_pages could cause
out-of-bounds pagemap reads and/or writes.  On write, 'span' pointers
are stored into the data area, corrupting the allocated objects.  On
read (aka during a dsa_free), garbage is interpreted as a span pointer,
typically crashing the server in dsa_get_address().

The normal geometric path correctly sizes the pagemap for all pages in
the segment.  The odd-sized path needs to do the same, but it works
forward from usable_pages rather than backward from total_size.

This commit fixes the sizing of the odd-sized case by adding pagemap
entries for the metadata pages after the initial metadata_bytes
calculation, using an integer ceiling division to compute the exact
number of additional entries needed in one go, avoiding any iteration in
the calculation.

An assertion is added in the code path for odd-sized segments, ensuring
that the pagemap includes the metadata area, and that the result is
appropriately sized.

This problem would show up depending on the size requested for the
allocation of a DSA segment.  The reporter has noticed this issue when a
parallel hash join makes a DSA allocation large enough to trigger the
odd-sized segment path, but it could happen for anything that does a DSA
allocation.

A regression test is added to test_dsa, down to v17 where the test
module has been introduced.  This adds a set of cheap tests to check the
problem, the new assertion being useful for this purpose.  Sami has
proposed a test that took a longer time than what I have done here; the
test committed is faster and good enough to check the odd-sized
allocation path.

Author: Paul Bunn <paul.bunn@icloud.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/044401dcabac$fe432490$fac96db0$@icloud.com
Backpatch-through: 14

Fix invalid boolean if-test

We were testing the truth value of the array of booleans (which is
always true) instead of the boolean element specific to the affected
table column.

This causes a binary-upgrade dump fail to omit the name of a constraint;
that is, the correct constraint name is always printed, even when it's
not needed.  The affected case is a binary-upgrade dump of a not-null
constraint in an inherited column, which must in addition have no
comment.

Another point is that in order for this to make a difference, the
constraint must have the default name in the child table.  That is, the
constraint must have been created _in the parent table_ with the name
that it would have in the child table, like so:
  CREATE TABLE parent (a int CONSTRAINT child_a_not_null NOT NULL);
  CREATE TABLE child () INHERITS (parent);
Otherwise, the correct name must be printed by binary-upgrade pg_dump
anyway, since it wouldn't match the name produced at the parent.

Moreover, when it does hit, the pre-18-compatibility code (which has to
work with a constraint that has no name) gets involved and uses the
UPDATE on pg_constraint using the conkey instead of column name ... and
so everything ends up working correctly AFAICS.

I think it might cause a problem if the table and column names are
overly long, but I didn't want to spend time investigating further.

Still, it's wrong code, and static analyzers have twice complained about
it, so fix it by adding the array index accessor that was obviously
meant.

Reported-by: Ranier Vilela <ranier.vf@gmail.com>
Reported-by: George Tarasov <george.v.tarasov@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CAEudQAo7ah=4TDheuEjtb0dsv6bHoK7uBNqv53Tsub2h-xBSJw@mail.gmail.com
Discussion: https://postgr.es/m/f3029f25-acc9-4cb9-a74f-fe93bcfb3a27@gmail.com

Fix publisher shutdown hang caused by logical walsender busy loop.

Previously, when logical replication was running, shutting down
the publisher could cause the logical walsender to enter a busy loop
and prevent the publisher from completing shutdown.

During shutdown, the logical walsender waits for all pending WAL
to be written out. However, some WAL records could remain unflushed,
causing the walsender to wait indefinitely.

The issue occurred because the walsender used XLogBackgroundFlush() to
flush pending WAL. This function does not guarantee that all WAL is written.
For example, WAL generated by a transaction without an assigned
transaction ID that aborts might not be flushed.

This commit fixes the bug by making the logical walsender call XLogFlush()
instead, ensuring that all pending WAL is written and preventing
the busy loop during shutdown.

Backpatch to all supported versions.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAO6_Xqo3co3BuUVEVzkaBVw9LidBgeeQ_2hfxeLMQcXwovB3GQ@mail.gmail.com
Backpatch-through: 14

Exit after fatal errors in client-side compression code.

It looks like whoever wrote the astreamer (nee bbstreamer) code
thought that pg_log_error() is equivalent to elog(ERROR), but
it's not; it just prints a message. So all these places tried to
continue on after a compression or decompression error return,
with the inevitable result being garbage output and possibly
cascading error messages. We should use pg_fatal() instead.

These error conditions are probably pretty unlikely in practice,
which no doubt accounts for the lack of field complaints.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/1531718.1772644615@sss.pgh.pa.us
Backpatch-through: 15

Fix handling of updated tuples in the MERGE statement

This branch missed the IsolationUsesXactSnapshot() check.  That led to EPQ on
repeatable read and serializable isolation levels.  This commit fixes the
issue and provides a simple isolation check for that.  Backpatch through v15
where MERGE statement was introduced.

Reported-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/CAPpHfdvzZSaNYdj5ac-tYRi6MuuZnYHiUkZ3D-AoY-ny8v%2BS%2Bw%40mail.gmail.com
Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Backpatch-through: 15

doc: Clarify that COLUMN is optional in ALTER TABLE ... ADD/DROP COLUMN.

In ALTER TABLE ... ADD/DROP COLUMN, the COLUMN keyword is optional. However,
part of the documentation could be read as if COLUMN were required, which may
mislead users about the command syntax.

This commit updates the ALTER TABLE documentation to clearly state that
COLUMN is optional for ADD and DROP.

Also this commit adds regression tests covering ALTER TABLE ... ADD/DROP
without the COLUMN keyword.

Backpatch to all supported versions.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2n6ShLMOnjOtf63TjjgGbgiTVT5OMsSOFmbjGb6Xue1Bw@mail.gmail.com
Backpatch-through: 14

Fix rare instability in recovery TAP test 004_timeline_switch

This fixes a problem similar to ad8c86d22cbd.  In this case, the test
could fail under the following circumstances:
- The primary is stopped with teardown_node(), meaning that it may not
be able to send all its WAL records to standby_1 and standby_2.
- If standby_2 receives more records than standby_1, attempting to
reconnect standby_2 to the promoted standby_1 would fail because of a
timeline fork.

This race condition is fixed with a simple trick: instead of tearing
down the primary, it is stopped cleanly so as all the WAL records of the
primary are received and flushed by both standby_1 and standby_2.  Once
we do that, there is no need for a wait_for_catchup() before stopping
the node.  The test wants to check that a timeline jump can be achieved
when reconnecting a standby to a promoted standby in the same cluster,
hence an immediate stop of the primary is not required.

This failure is harder to reach than the previous instability of
009_twophase, still the buildfarm has been able to detect this failure
at least once.  I have tried Alexander Lakhin's test trick with the
bgwriter and very aggressive standby snapshots, but I could not
reproduce it directly.  It is reachable, as the buildfarm has proved.

Backpatch down to all supported branches, and this problem can lead to
spurious failures in the buildfarm.

Discussion: https://postgr.es/m/493401a8-063f-436a-8287-a235d9e065fc@gmail.com
Backpatch-through: 14

Fix yet another bug in archive streamer with LZ4 decompression.

The code path in astreamer_lz4_decompressor_content() that updated
the output pointers when the output buffer isn't full was wrong.
It advanced next_out by bytes_written, which could include previous
decompression output not just that of the current cycle. The
correct amount to advance is out_size. While at it, make the
output pointer updates look more like the input pointer updates.

This bug is pretty hard to reach, as it requires consecutive
compression frames that are too small to fill the output buffer.
pg_dump could have produced such data before 66ec01dc4, but
I'm unsure whether any files we use astreamer with would be
likely to contain problematic data.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0594CC79-1544-45DD-8AA4-26270DE777A7@gmail.com
Backpatch-through: 15

Don't malloc(0) in EventTriggerCollectAlterTSConfig

Author: Florin Irion <florin.irion@enterprisedb.com>
Discussion: https://postgr.es/m/c6fff161-9aee-4290-9ada-71e21e4d84de@gmail.com

Add test for row-locking and multixids with prepared transactions

This is a repro for the issue fixed in commit ccae90abdb. Backpatch to
v17 like that commit, although that's a little arbitrary as this test
would work on older versions too.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAA5RZ0twq5bNMq0r0QNoopQnAEv+J3qJNCrLs7HVqTEntBhJ=g@mail.gmail.com
Backpatch-through: 17

Skip prepared_xacts test if max_prepared_transactions < 2

This reduces maintenance overhead, as we no longer need to update the
dummy expected output file every time the .sql file changes.

Discussion: https://www.postgresql.org/message-id/1009073.1772551323@sss.pgh.pa.us
Backpatch-through: 14

Fix rare instability in recovery TAP test 009_twophase

The phase of the test where we want to check that 2PC transactions
prepared on a primary can be committed on a promoted standby relied on
an immediate stop of the primary.  This logic has a race condition: it
could be possible that some records (most likely standby snapshot
records) are generated on the primary before it finishes its shutdown,
without the promoted standby know about them.  When the primary is
recycled as new standby, the test could fail because of a timeline fork
as an effect of these extra records.

This fix takes care of the instability by doing a clean stop of the
primary instead of a teardown (aka immediate stop), so as all records
generated on the primary are sent to the promoted standby and flushed
there.  There is no need for a teardown of the primary in this test
scenario: the commit of 2PC transactions on a promoted standby do not
care about the state of the primary, only of the standby.

This race is very hard to hit in practice, even slow buildfarm members
like skink have a very low rate of reproduction.  Alexander Lakhin has
come up with a recipe to improve the reproduction rate a lot:
- Enable -DWAL_DEBUG.
- Patch the bgwriter so as standby snapshots are generated every
milliseconds.
- Run 009_twophase tests under heavy parallelism.

With this method, the failure appears after a couple of iterations.
With the fix in place, I have been able to run more than 50 iterations
of the parallel test sequence, without seeing a failure.

Issue introduced in 30820982b295, due to a copy-pasto coming from the
surrounding tests.  Thanks also to Hayato Kuroda for digging into the
details of the failure.  He has proposed a fix different than the one of
this commit.  Unfortunately, it relied on injection points, feature only
available in v17.  The solution of this commit is simpler, and can be
applied to v14~v16.

Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/b0102688-6d6c-c86a-db79-e0e91d245b1a@gmail.com
Backpatch-through: 14

doc: Fix sentence of pg_walsummary page

Author: Peter Smith <smithpb2250@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Discussion: https://postgr.es/m/CAHut+PvfYBL-ppX-i8DPeRu7cakYCZz+QYBhrmQzicx7z_Tj5w@mail.gmail.com
Backpatch-through: 17

doc: Clarify that empty COMMENT string removes the comment.

Clarify the documentation of COMMENT ON to state that specifying an empty
string is treated as NULL, meaning that the comment is removed.

This makes the behavior explicit and avoids possible confusion about how
empty strings are handled.

Also adds regress test cases that use empty string to remove a comment.

Backpatch to all supported versions.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: David G. Johnston <david.g.johnston@gmail.com>
Reviewed-by: Shengbin Zhao <zshengbin91@gmail.com>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: zhangqiang <zhang_qiang81@163.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/26476097-B1C1-4BA8-AA92-0AD0B8EC7190@gmail.com
Backpatch-through: 14

basic_archive: Allow archive directory to be missing at startup.

Presently, the GUC check hook for basic_archive.archive_directory
checks that the specified directory exists.  Consequently, if the
directory does not exist at server startup, archiving will be stuck
indefinitely, even if it appears later.  To fix, remove this check
from the hook so that archiving will resume automatically once the
directory is present.  basic_archive must already be prepared to
deal with the directory disappearing at any time, so no additional
special handling is required.

Reported-by: Олег Самойлов <splarv@ya.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Discussion: https://postgr.es/m/73271769675212%40mail.yandex.ru
Backpatch-through: 15

Fix OldestMemberMXactId and OldestVisibleMXactId array usage

Commit ab355e3a88 changed how the OldestMemberMXactId array is
indexed. It's no longer indexed by synthetic dummyBackendId, but with
ProcNumber. The PGPROC entries for prepared xacts come after auxiliary
processes in the allProcs array, which rendered the calculation for
MaxOldestSlot and the indexes into the array incorrect. (The
OldestVisibleMXactId array is not used for prepared xacts, and thus
never accessed with ProcNumber's greater than MaxBackends, so this
only affects the OldestMemberMXactId array.)

As a result, a prepared xact would store its value past the end of the
OldestMemberMXactId array, overflowing into the OldestVisibleMXactId
array. That could cause a transaction's row lock to appear invisible
to other backends, or other such visibility issues. With a very small
max_connections setting, the store could even go beyond the
OldestVisibleMXactId array, stomping over the first element in the
BufferDescriptor array.

To fix, calculate the array sizes more precisely, and introduce helper
functions to calculate the array indexes correctly.

Author: Yura Sokolov <y.sokolov@postgrespro.ru>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/7acc94b0-ea82-4657-b1b0-77842cb7a60c@postgrespro.ru
Backpatch-through: 17

In pg_dumpall, don't skip role GRANTs with dangling grantor OIDs.

In commits 29d75b25b et al, I made pg_dumpall's dumpRoleMembership
logic treat a dangling grantor OID the same as dangling role and
member OIDs: print a warning and skip emitting the GRANT.  This wasn't
terribly well thought out; instead, we should handle the case by
emitting the GRANT without the GRANTED BY clause.  When the source
database is pre-v16, such cases are somewhat expected because those
versions didn't prevent dropping the grantor role; so don't even
print a warning that we did this.  (This change therefore restores
pg_dumpall's pre-v16 behavior for these cases.)  The case is not
expected in >= v16, so then we do print a warning, but soldiering on
with no GRANTED BY clause still seems like a reasonable strategy.

Per complaint from Robert Haas that we were now dropping GRANTs
altogether in easily-reachable scenarios.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CA+TgmoauoiW4ydDhdrseg+DD4Kwha=+TSZp18BrJeHKx3o1Fdw@mail.gmail.com
Backpatch-through: 16

Fix memory allocation size in RegisterExtensionExplainOption()

The allocations used for the static array ExplainExtensionOptionArray,
that tracks a set of ExplainExtensionOption, used "char *" instead of
ExplainExtensionOption as the memory size consumed by one element,
underestimating the memory required by half.

The initial allocation of ExplainExtensionNameArray wants to hold 16
elements before being reallocated, and with "char *" it meant that there
was enough space only for 8 ExplainExtensionOption elements, 16 bytes
required for each element. The backend would crash once one tries to
register a 9th EXPLAIN option.

As far as I can see, the allocation formulas of GetExplainExtensionId()
have been copy-pasted to RegisterExtensionExplainOption(), but the
internal maths of the copy were not adjusted accordingly.

Oversight in c65bc2e1d14a.

Author: Joel Jacobson <joel@compiler.org>
Discussion: https://postgr.es/m/2a4bd2f5-2a2f-409f-8ac7-110dd3fad4fc@app.fastmail.com
Backpatch-through: 18

test_custom_types: Test module with fancy custom data types

This commit adds a new test module called "test_custom_types", that can
be used to stress code paths related to custom data type
implementations.

Currently, this is used as a test suite to validate the set of fixes
done in 3b7a6fa15720, that requires some typanalyze callbacks that can
force very specific backend behaviors, as of:
- typanalyze callback that returns "false" as status, to mark a failure
in computing statistics.
- typanalyze callback that returns "true" but let's the backend know
that no interesting stats could be computed, with stats_valid set to
"false".

This could be extended more in the future if more problems are found.
For simplicity, the module uses a fake int4 data type, that requires a
btree operator class to be usable with extended statistics. The type is
created by the extension, and its properties are altered in the test.

Like 3b7a6fa15720, this module is backpatched down to v14, for coverage
purposes.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz
Backpatch-through: 14

Fix set of issues with extended statistics on expressions

This commit addresses two defects regarding extended statistics on
expressions:
- When building extended statistics in lookup_var_attr_stats(), the call
to examine_attribute() did not account for the possibility of a NULL
return value.  This can happen depending on the behavior of a typanalyze
callback — for example, if the callback returns false, if no rows are
sampled, or if no statistics are computed.  In such cases, the code
attempted to build MCV, dependency, and ndistinct statistics using a
NULL pointer, incorrectly assuming valid statistics were available,
which could lead to a server crash.
- When loading extended statistics for expressions,
statext_expressions_load() did not account for NULL entries in the
pg_statistic array storing expression statistics.  Such NULL entries can
be generated when statistics collection fails for an expression, as may
occur during the final step of serialize_expr_stats().  An extended
statistics object defining N expressions requires N corresponding
elements in the pg_statistic array stored for the expressions, and some
of these elements can be NULL.  This situation is reachable when a
typanalyze callback returns true, but sets stats_valid to indicate that
no useful statistics could be computed.

While these scenarios cannot occur with in-core typanalyze callbacks, as
far as I have analyzed, they can be triggered by custom data types with
custom typanalyze implementations, at least.

No tests are added in this commit.  A follow-up commit will introduce a
test module that can be extended to cover similar edge cases if
additional issues are discovered.  This takes care of the core of the
problem.

Attribute and relation statistics already offer similar protections:
- ANALYZE detects and skips the build of invalid statistics.
- Invalid catalog data is handled defensively when loading statistics.

This issue exists since the support for extended statistics on
expressions has been added, down to v14 as of a4d75c86bf15.  Backpatch
to all supported stable branches.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aaDrJsE1I5mrE-QF@paquier.xyz
Backpatch-through: 14

Don't flatten join alias Vars that are stored within a GROUP RTE.

The RTE's groupexprs list is used for deparsing views, and for that
usage it must contain the original alias Vars; else we can get
incorrect SQL output.  But since commit 247dea89f,
parseCheckAggregates put the GROUP BY expressions through
flatten_join_alias_vars before building the RTE_GROUP RTE.
Changing the order of operations there is enough to fix it.

This patch unfortunately can do nothing for already-created views:
if they use a coding pattern that is subject to the bug, they will
deparse incorrectly and hence present a dump/reload hazard in the
future.  The only fix is to recreate the view from the original SQL.
But the trouble cases seem to be quite narrow.  AFAICT the output
was only wrong for "SELECT ... t1 LEFT JOIN t2 USING (x) GROUP BY x"
where t1.x and t2.x were not of identical data types and t1.x was
the side that required an implicit coercion.  If there was no hidden
coercion, or if the join was plain, RIGHT, or FULL, the deparsed
output was uglier than intended but not functionally wrong.

Reported-by: Swirl Smog Dowry <swirl-smog-dowry@duck.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/CA+-gibjCg_vjcq3hWTM0sLs3_TUZ6Q9rkv8+pe2yJrdh4o4uoQ@mail.gmail.com
Backpatch-through: 18

postgres_fdw: Fix thinko in comment for UserMappingPasswordRequired().

This commit also rephrases this comment to improve readability.

Oversight in commit 6136e94dc.

Reported-by: Etsuro Fujita <etsuro.fujita@gmail.com>
Author: Andreas Karlsson <andreas@proxel.se>
Co-authored-by: Etsuro Fujita <etsuro.fujita@gmail.com>
Discussion: https://postgr.es/m/CAPmGK16pDnM_wU3kmquPj-M9MYqG3y0BdntRZ0eytqbCaFY3WQ%40mail.gmail.com
Backpatch-through: 14

Yet another ltree fix for REL_18_STABLE.

Fix buildfarm failure from code that's only present in version 18,
introduced by commit b3c2a3d386.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/739010.1772142042@sss.pgh.pa.us

Fix more multibyte issues in ltree.

Commit 84d5efa7e3 missed some multibyte issues caused by short-circuit
logic in the callers. The callers assumed that if the predicate string
is longer than the label string, then it couldn't possibly be a match,
but it can be when using case-insensitive matching (LVAR_INCASE) if
casefolding changes the byte length.

Fix by refactoring to get rid of the short-circuit logic as well as
the function pointer, and consolidate the logic in a replacement
function ltree_label_match().

Discussion: https://postgr.es/m/02c6ef6cf56a5013ede61ad03c7a26affd27d449.camel@j-davis.com
Backpatch-through: 14

Fix memory leaks in pg_locale_icu.c.

The backport prior to 18 requires minor modification due to code
refactoring.

Discussion: https://postgr.es/m/e2b7a0a88aaadded7e2d19f42d5ab03c9e182ad8.camel@j-davis.com
Backpatch-through: 16

pg_dump: Preserve NO INHERIT on NOT NULL on inheritance children

When the constraint is printed without the column, we were not printing
the NO INHERIT flag.

Author: Jian He <jian.universality@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/CACJufxEDEOO09G+OQFr=HmFr9ZDLZbRoV7+pj58h3_WeJ_K5UQ@mail.gmail.com

EUC_CN, EUC_JP, EUC_KR, EUC_TW: Skip U+00A0 tests instead of failing.

Settings that ran the new test euc_kr.sql to completion would fail these
older src/pl tests.  Use alternative expected outputs, for which psql
\gset and \if have reduced the maintenance burden.  This fixes
"LANG=ko_KR.euckr LC_MESSAGES=C make check-world".  (LC_MESSAGES=C fixes
IO::Pty usage in tests 010_tab_completion and 001_password.)  That file
is new in commit c67bef3f3252a3a38bf347f9f119944176a796ce.  Back-patch
to v14, like that commit.

Discussion: https://postgr.es/m/20260217184758.da.noahmisch@microsoft.com
Backpatch-through: 14

doc: Clarify INCLUDING COMMENTS behavior in CREATE TABLE LIKE.

The documentation for the INCLUDING COMMENTS option of the LIKE clause
in CREATE TABLE was inaccurate and incomplete. It stated that comments for
copied columns, constraints, and indexes are copied, but regarding comments
on constraints in reality only comments on CHECK and NOT NULL constraints
are copied; comments on other constraints (such as primary keys) are not.
In addition, comments on extended statistics are copied, but this was not
documented.

The CREATE FOREIGN TABLE documentation had a similar omission: comments
on extended statistics are also copied, but this was not mentioned.

This commit updates the documentation to clarify the actual behavior.
The CREATE TABLE reference now specifies that comments on copied columns,
CHECK constraints, NOT NULL constraints, indexes, and extended statistics are
copied. The CREATE FOREIGN TABLE reference now notes that comments on
extended statistics are copied as well.

Backpatch to all supported versions. Documentation updates related to
CREATE FOREIGN TABLE LIKE and NOT NULL constraint comment copying are
not applied to v17 and earlier, since those features were introduced in v18.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwHSOSGcaYDvHF8EYCUCfGPjbRwGFsJ23cx5KbJ1X6JouQ@mail.gmail.com
Backpatch-through: 14

Fix ProcWakeup() resetting wrong waitStart field.

Previously, when one process woke another that was waiting on a lock,
ProcWakeup() incorrectly cleared its own waitStart field (i.e.,
MyProc->waitStart) instead of that of the process being awakened.
As a result, the awakened process retained a stale lock-wait start timestamp.

This did not cause user-visible issues. pg_locks.waitstart was reported as
NULL for the awakened process (i.e., when pg_locks.granted is true),
regardless of the waitStart value.

This bug was introduced by commit 46d6e5f56790.

This commit fixes this by resetting the waitStart field of the process
being awakened in ProcWakeup().

Backpatch to all supported branches.

Reported-by: Chao Li <li.evan.chao@gmail.com>
Author: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: ji xu <thanksgreed@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/537BD852-EC61-4D25-AB55-BE8BE46D07D7@gmail.com
Backpatch-through: 14

Allow PG_PRINTF_ATTRIBUTE to be different in C and C++ code.

Although clang claims to be compatible with gcc's printf format
archetypes, this appears to be a falsehood: it likes __syslog__
(which gcc does not, on most platforms) and doesn't accept
gnu_printf. This means that if you try to use gcc with clang++
or clang with g++, you get compiler warnings when compiling
printf-like calls in our C++ code. This has been true for quite
awhile, but it's gotten more annoying with the recent appearance
of several buildfarm members that are configured like this.

To fix, run separate probes for the format archetype to use with the
C and C++ compilers, and conditionally define PG_PRINTF_ATTRIBUTE
depending on __cplusplus.

(We could alternatively insist that you not mix-and-match C and
C++ compilers; but if the case works otherwise, this is a poor
reason to insist on that.)

This commit back-patches 0909380e4 into supported branches.

Discussion: https://postgr.es/m/986485.1764825548@sss.pgh.pa.us
Discussion: https://postgr.es/m/3988414.1771950285@sss.pgh.pa.us
Backpatch-through: 14-18

Fix some cases of indirectly casting away const.

Newest versions of gcc+glibc are able to detect cases where code
implicitly casts away const by assigning the result of strchr() or
a similar function applied to a "const char *" value to a target
variable that's just "char *".  This of course creates a hazard of
not getting a compiler warning about scribbling on a string one was
not supposed to, so fixing up such cases is good.

This patch fixes a dozen or so places where we were doing that.
Most are trivial additions of "const" to the target variable,
since no actually-hazardous change was occurring.

Thanks to Bertrand Drouvot for finding a couple more spots than
I had.

This commit back-patches relevant portions of 8f1791c61 and
9f7565c6c into supported branches.  However, there are two
places in ecpg (in v18 only) where a proper fix is more
complicated than seems appropriate for a back-patch.  I opted
to silence those two warnings by adding casts.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us
Discussion: https://postgr.es/m/3988414.1771950285@sss.pgh.pa.us
Backpatch-through: 14-18

Stabilize output of new isolation test insert-conflict-do-update-4.

The test added by commit 4b760a181 assumed that a table's physical
row order would be predictable after an UPDATE.  But a non-heap table
AM might produce some other order.  Even with heap AM, the assumption
seems risky; compare a3fd53bab for instance.  Adding an ORDER BY is
cheap insurance and doesn't break any goal of the test.

Author: Pavel Borisov <pashkin.elfe@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CALT9ZEHcE6tpvumScYPO6pGk_ASjTjWojLkodHnk33dvRPHXVw@mail.gmail.com
Backpatch-through: 14

Fix unsafe RTE_GROUP removal in simplify_EXISTS_query

When simplify_EXISTS_query removes the GROUP BY clauses from an EXISTS
subquery, it previously deleted the RTE_GROUP RTE directly from the
subquery's range table.

This approach is dangerous because deleting an RTE from the middle of
the rtable list shifts the index of any subsequent RTE, which can
silently corrupt any Var nodes in the query tree that reference those
later relations.  (Currently, this direct removal has not caused
problems because the RTE_GROUP RTE happens to always be the last entry
in the rtable list.  However, relying on that is extremely fragile and
seems like trouble waiting to happen.)

Instead of deleting the RTE_GROUP RTE, this patch converts it in-place
to be RTE_RESULT type and clears its groupexprs list.  This preserves
the length and indexing of the rtable list, ensuring all Var
references remain intact.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3472344.1771858107@sss.pgh.pa.us
Backpatch-through: 18

pg_upgrade: Use max_protocol_version=3.0 for older servers

The grease patch in 4966bd3ed found its first problem: prior to the
February 2018 patch releases, no server knew how to negotiate protocol
versions, so pg_upgrade needs to take that into account when speaking to
those older servers.

This will be true even after the grease feature is reverted; we don't
need anyone to trip over this again in the future. Backpatch so that all
supported versions of pg_upgrade can gracefully handle an update to the
default protocol version. (This is needed for any distributions that
link older binaries against newer libpqs, such as Debian.) Branches
prior to 18 need an additional version check, for the existence of
max_protocol_version.

Per buildfarm member crake.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAOYmi%2B%3D4QhCjssfNEoZVK8LPtWxnfkwT5p-PAeoxtG9gpNjqOQ%40mail.gmail.com
Backpatch-through: 14

Stamp 18.3.

Translation updates

Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 8edd578f1856e2ac142bb3bb7090ec0a58cd8ac6

Release notes for 18.3, 17.9, 16.13, 15.17, 14.22.

Update .abi-compliance-history for AddRelationNotNullConstraints().

Commit 8d9a97e0bb6d anticipated this change:

1 function with incompatible sub-type changes:

[C] 'function List* AddRelationNotNullConstraints(Relation, List*, List*)' has some sub-type changes:
parameter 4 of type 'List*' was added

Discussion: https://postgr.es/m/19393-6a82427485a744cf@postgresql.org
Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2026-02-21%2011%3A27%3A03
Backpatch-through: 18 only

Avoid name collision with NOT NULL constraints

If a CREATE TABLE statement defined a constraint whose name is identical
to the name generated for a NOT NULL constraint, we'd throw an
(unnecessary) unique key violation error on
pg_constraint_conrelid_contypid_conname_index: this can easily be
avoided by choosing a different name for the NOT NULL constraint.

Fix by passing the constraint names already created by
AddRelationNewConstraints() to AddRelationNotNullConstraints(), so that
the latter can avoid name collisions with them.

Bug: #19393
Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reported-by: Hüseyin Demir <huseyin.d3r@gmail.com>
Backpatch-through: 18
Discussion: https://postgr.es/m/19393-6a82427485a744cf@postgresql.org

First-draft release notes for 18.3.

As usual, the release notes for other branches will be made by cutting
these down, but put them up for community review first.

Fix expanding 'bounds' in pg_trgm's calc_word_similarity() function

If the 'bounds' array needs to be expanded, because the input contains
more trigrams than the initial guess, the code didn't return the
reallocated array correctly to the caller. That could lead to a crash
in the rare case that the input string becomes longer when it's
lower-cased. The only known instance of that is when an ICU locale is
used with certain single-byte encodings. This was an oversight in
commit 00896ddaf41f.

Author: Zsolt Parragi <zsolt.parragi@percona.com>
Backpatch-through: 18

Fix computation of varnullingrels when translating appendrel Var

When adjust_appendrel_attrs translates a Var referencing a parent
relation into a Var referencing a child relation, it propagates
varnullingrels from the parent Var to the translated Var.  Previously,
the code simply overwrote the translated Var's varnullingrels with
those of the parent.

This was incorrect because the translated Var might already possess
nonempty varnullingrels.  This happens, for example, when a LATERAL
subquery within a UNION ALL references a Var from the nullable side of
an outer join.  In such cases, the translated Var correctly carries
the outer join's relid in its varnullingrels.  Overwriting these bits
with the parent Var's set caused the planner to lose track of the fact
that the Var could be nulled by that outer join.

In the reported case, because the underlying column had a NOT NULL
constraint, the planner incorrectly deduced that the Var could never
be NULL and discarded essential IS NOT NULL filters.  This led to
incorrect query results where NULL rows were returned instead of being
filtered out.

To fix, use bms_add_members to merge the parent Var's varnullingrels
into the translated Var's existing set, preserving both sources of
nullability.

Back-patch to v16.  Although the reported case does not seem to cause
problems in v16, leaving incorrect varnullingrels in the tree seems
like a trap for the unwary.

Bug: #19412
Reported-by: Sergey Shinderuk <s.shinderuk@postgrespro.ru>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19412-1d0318089b86859e@postgresql.org
Backpatch-through: 16

Add translator comment

Otherwise the message is not very clear.

Backpatch-through: 18

Update obsolete comment

table_tuple_update's update_indexes argument hasn't been a boolean since
commit 19d8e2308bc5.

Backpatch-through: 16

Fix the volatility setting of json{b}_strip_nulls

Commit 4603903d294 unfortunately reset the volatility of these functions
to STABLE whereas they had previously been set to IMMUTABLE. We can't
force a catalog update in the stable release, although a pg_update would
probably do the trick. A simpler fix, though, for affected users is
probably a simple catalog surgery along the lines of:

UPDATE pg_proc SET provolatile = 'i' WHERE oid in (3261,3262);

Applied to 18 only. In master we are planning to get rid of the separate
redeclarations for defaults in system_functions.sql.

Bug: #19409
Reported-By: Lucio Chiessi <lucio.chiessi@trustly.com>
Discussion: https://postgr.es/m/19409-e16cd2605e59a4af@postgresql.org

Fix test_valid_server_encoding helper function.

Commit c67bef3f325 introduced this test helper function for use by
src/test/regress/sql/encoding.sql, but its logic was incorrect. It
confused an encoding ID for a boolean so it gave the wrong results for
some inputs, and also forgot the usual return macro. The mistake didn't
affect values actually used in the test, so there is no change in
behavior.

Also drop it and another missed function at the end of the test, for
consistency.

Backpatch-through: 14
Author: Zsolt Parragi <zsolt.parragi@percona.com>

Suppress new "may be used uninitialized" warning.

Various buildfarm members, having compilers like gcc 8.5 and 6.3, fail
to deduce that text_substring() variable "E" is initialized if
slice_size!=-1. This suppression approach quiets gcc 8.5; I did not
reproduce the warning elsewhere. Back-patch to v14, like commit
9f4fd119b2cbb9a41ec0c19a8d6ec9b59b92c125.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1157953.1771266105@sss.pgh.pa.us
Backpatch-through: 14

hstore: Fix NULL pointer dereference with receive function

The receive function of hstore was not able to handle correctly
duplicate key values when a new duplicate links to a NULL value, where a
pfree() could be attempted on a NULL pointer, crashing due to a pointer
dereference.

This problem would happen for a COPY BINARY, when stacking values like
that:
aa => 5
aa => null

The second key/value pair is discarded and pfree() calls are attempted
on its key and its value, leading to a pointer dereference for the value
part as the value is NULL. The first key/value pair takes priority when
a duplicate is found.

Per offline report.

Reported-by: "Anemone" <vergissmeinnichtzh@gmail.com>
Reported-by: "A1ex" <alex000young@gmail.com>
Backpatch-through: 14

Don't reset 'latest_page_number' when replaying multixid truncation

'latest_page_number' is set to the correct value, according to
nextOffset, early at system startup. Contrary to the comment, it hence
should be set up correctly by the time we get to WAL replay.

This fixes a failure to replay WAL generated on older minor versions,
before commit 789d65364c (18.2, 17.8, 16.12, 15.16, 14.21). The
failure occurs after a truncation record has been replayed and looks
like this:

    FATAL:  could not access status of transaction 858112
    DETAIL:  Could not read from file "pg_multixact/offsets/000D" at offset 24576: read too few bytes.
    CONTEXT:  WAL redo at 3/2A3AB408 for MultiXact/CREATE_ID: 858111 offset 6695072 nmembers 5: 1048228 (sh) 1048271 (keysh) 1048316 (sh) 1048344 (keysh) 1048370 (sh)

Reported-by: Sebastian Webber <sebastian@swebber.me>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Discussion: https://www.postgresql.org/message-id/20260214090150.GC2297@p46.dedyn.io;lightning.p46.dedyn.io
Backpatch-through: 14-18