git.ipfire.org Git - thirdparty/postgresql.git/log

Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo <guofenglinux@gmail.com>
Author: Antonin Houska <ah@cybertec.at> (in an older version)
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Matheus Alcantara <matheusssilv97@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tomas Vondra <tomas@vondra.me> (in an older version)
Reviewed-by: Andy Fan <zhihuifan1213@163.com> (in an older version)
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com

Allow negative aggtransspace to indicate unbounded state size

This patch reuses the existing aggtransspace in pg_aggregate to
signal that an aggregate's transition state can grow unboundedly. If
aggtransspace is set to a negative value, it now indicates that the
transition state may consume unpredictable or large amounts of memory,
such as in aggregates like array_agg or string_agg that accumulate
input rows.

This information can be used by the planner to avoid applying
memory-sensitive optimizations (e.g., eager aggregation) when there is
a risk of excessive memory usage during partial aggregation.

Bump catalog version.

Per idea from Robert Haas, though applied differently than originally
suggested.

Discussion: https://postgr.es/m/CA+TgmoYbkvYwLa+1vOP7RDY7kO2=A7rppoPusoRXe44VDOGBPg@mail.gmail.com

Improve description of some WAL records for GIN

The following information is added in the description of some GIN
records:
- In INSERT_LISTPAGE, the number of tuples and the right link block.
- In UPDATE_META_PAGE, the number of tuples, the previous tail block,
and the right link block.
- In SPLIT, the left and right children blocks.

Author: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CALdSSPgnAt5L=D_xGXRXLYO5FK1H31_eYEESxdU1n-r4g+6GqA@mail.gmail.com

Add stats_reset to pg_stat_user_functions

It is possible to call pg_stat_reset_single_function_counters() for a
single function, but the reset time was missing the system view showing
its statistics. Like all the fields of pg_stat_user_functions, the GUC
track_functions needs to be enabled to show the statistics about
function executions.

Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to
PgStat_StatFuncEntry.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aONjnsaJSx-nEdfU@paquier.xyz

Fix typo in function header comment.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+TgmoZYh_nw-2j_Fi9y6ZAvrpN+W1aSOFNM7Rus2Q-zTkCsQw@mail.gmail.com

Fix Coverity issues reported in commit 25a30bbd423.

Fix several issues pointed out by Coverity (reported by Tome Lane).

- In row_is_in_frame(), return value of window_gettupleslot() was not
checked.

- WinGetFuncArgInPartition() tried to derefference "isout" pointer
even if it could be NULL in some places.

Besides the issues, I also fixed a compiler warning reported by Álvaro
Herrera.

Moreover, in WinGetFuncArgInPartition refactor the do...while loop so
that the codes inside the loop simpler. Also simplify the case when
abs_pos < 0.

Author: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Paul Ramsey <pramsey@cleverelephant.ca>
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reported-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/1686755.1759679957%40sss.pgh.pa.us
Discussion: https://postgr.es/m/202510051612.gw67jlc2iqpw%40alvherre.pgsql

Cleanup INFINITY code in float.h

The INFINITY macro is always defined per C99 standard, so this should
mean we can now get rid of the workaround code for when that macro isn't
defined.

Also, delete the (now unneeded) #pragma code which was disabling a
compiler warning in MSVC. There was a comment explaining why the #pragma
was placed outside the function body to work around a MSVC compiler bug,
but the link explaining that was dead, as reported by jian he.

Author: David Rowley <dgrowleyml@gmail.com>
Reported-by: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxGARYETnNwtCK7QC0zE_7gq-tfN0mME=gT5rTNtC=VSHQ@mail.gmail.com

Remove PlannerInfo's join_search_private method.

Instead, use the new mechanism that allows planner extensions to store
private state inside a PlannerInfo, treating GEQO as an in-core planner
extension. This is a useful test of the new facility, and also buys
back a few bytes of storage.

To make this work, we must remove innerrel_is_unique_ext's hack of
testing whether join_search_private is set as a proxy for whether
the join search might be retried. Add a flag that extensions can
use to explicitly signal their intentions instead.

Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com

Allow private state in certain planner data structures.

Extension that make extensive use of planner hooks may want to
coordinate their efforts, for example to avoid duplicate computation,
but that's currently difficult because there's no really good way to
pass data between different hooks.

To make that easier, allow for storage of extension-managed private
state in PlannerGlobal, PlannerInfo, and RelOptInfo, along very
similar lines to what we have permitted for ExplainState since commit
c65bc2e1d14a2d4daed7c1921ac518f2c5ac3d17.

Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/CA+TgmoYWKHU2hKr62Toyzh-kTDEnMDeLw7gkOOnjL-TnOUq0kQ@mail.gmail.com

Adjust new TAP test to work on macOS.

Seems Apple's version of "wc -l" puts spaces before the number.
(I wonder why the cfbot didn't find this.) While here, make
the failure case log what it got, to aid debugging future issues.

Per buildfarm.

Improve psql's ability to select pager mode accurately.

We try to use the pager only when more than a screenful's worth of
data is to be printed.  However, the code in print.c that's concerned
with counting the number of lines that will be needed missed a lot of
edge cases:

* While plain aligned mode accounted for embedded newlines in column
headers and table cells, unaligned and vertical output modes did not.

* In particular, since vertical mode repeats the headers for each
record, we need to account for embedded newlines in the headers for
each record.

* Multi-line table titles were not accounted for.

* tuples_only mode (where headers aren't printed) wasn't accounted
for.

* Footers were accounted for as one line per footer, again missing
the possibility of multi-line footers.  (In some cases such as
"\d+" on a view, there can be many lines in a footer.)  Also,
we failed to account for the default footer.

To fix, move the entire responsibility for counting lines into
IsPagerNeeded (or actually, into a new subroutine count_table_lines),
and then expand the logic as appropriate.  Also restructure to make it
perhaps a bit easier to follow.  It's still only completely accurate
for ALIGNED/WRAPPED/UNALIGNED formats, but the other formats are not
typically used with interactive output.

Arrange to not run count_table_lines at all unless we will use
its result, and teach it to quit early as soon as it's proven
that the output is long enough to require use of the pager.
When dealing with large tables this should save a noticeable
amount of time, since pg_wcssize() isn't exactly cheap.

In passing, move the "flog" output step to the bottom of printTable(),
rather than running it when we've already opened the pager in some
modes.  In principle it shouldn't interfere with the pager because
flog should always point to a non-interactive file; but it seems silly
to risk any interference, especially when the existing positioning
seems to have been chosen with the aid of a dartboard.

Also add a TAP test to exercise pager mode.  Up to now, we have had
zero test coverage of these code paths, because they aren't reached
unless isatty(stdout).  We do have the test infrastructure to improve
that situation, though.  Following the lead of 010_tab_completion.pl,
set up an interactive psql and feed it some test cases.  To detect
whether it really did invoke the pager, point PSQL_PAGER to "wc -l".
The test is skipped if that utility isn't available.

Author: Erik Wienhold <ewie@ewie.name>
Test-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2dd2430f-dd20-4c89-97fd-242616a3d768@ewie.name

Assign each subquery a unique name prior to planning it.

Previously, subqueries were given names only after they were planned,
which makes it difficult to use information from a previous execution of
the query to guide future planning. If, for example, you knew something
about how you want "InitPlan 2" to be planned, you won't know whether
the subquery you're currently planning will end up being "InitPlan 2"
until after you've finished planning it, by which point it's too late to
use the information that you had.

To fix this, assign each subplan a unique name before we begin planning
it. To improve consistency, use textual names for all subplans, rather
than, as we did previously, a mix of numbers (such as "InitPlan 1") and
names (such as "CTE foo"), and make sure that the same name is never
assigned more than once.

We adopt the somewhat arbitrary convention of using the type of sublink
to set the plan name; for example, a query that previously had two
expression sublinks shown as InitPlan 2 and InitPlan 1 will now end up
named expr_1 and expr_2. Because names are assigned before rather than
after planning, some of the regression test outputs show the numerical
part of the name switching positions: what was previously SubPlan 2 was
actually the first one encountered, but we finished planning it later.

We assign names even to subqueries that aren't shown as such within the
EXPLAIN output. These include subqueries that are a FROM clause item or
a branch of a set operation, rather than something that will be turned
into an InitPlan or SubPlan. The purpose of this is to make sure that,
below the topmost query level, there's always a name for each subquery
that is stable from one planning cycle to the next (assuming no changes
to the query or the database schema).

Author: Robert Haas <rhaas@postgresql.org>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Discussion: http://postgr.es/m/3641043.1758751399@sss.pgh.pa.us

doc: Add missing parenthesis in pg_stat_progress_analyze docs

Author: Shinya Kato <shinya11.kato@gmail.com>
Discussion: https://postgr.es/m/CAOzEurRgpAh9dsbEM88FPOhNaV_PkdL6p_9MJatcrNf9wXw1nw@mail.gmail.com
Backpatch-through: 18

Fix compile of src/tutorial/funcs.c

I broke this with recent #include removals. Fix by adding an explicit

Reported-by: Devrim Gündüz <devrim@gunduz.org>
Discussion: https://postgr.es/m/5e2c2d7c44434f3f0af7523864b27fe4fb590902.camel@gunduz.org

Teach planner to short-circuit EXCEPT/INTERSECT with dummy inputs

When either inputs of an INTERSECT [ALL] operator are proven not to return
any results (a dummy rel), then mark the entire INTERSECT operation as
dummy.

Likewise, if an EXCEPT [ALL] operation's left input is proven empty, then
mark the entire operation as dummy.

With EXCEPT ALL, we can easily handle the right input being dummy as
we can return the left input without any processing.  That can lead to
significant performance gains during query execution.  We can't easily
handle dummy right inputs for EXCEPT (without ALL), as that would require
deduplication of the left input.  Wiring up those Paths is likely more
complex than it's worth as the gains during execution aren't that great,
so let's leave that one to be handled by the normal Path generation code.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com

Fix incorrect targetlist in dummy UNIONs

The prior code, added in 03d40e4b5 attempted to use the targetlist of the
first UNION child when all UNION children were proven as dummy rels.
That's not going to work when some operation atop of the Result node must
find target entries within the Result's targetlist.  This could have been
something as simple as trying to sort the results of the UNION operation,
which would lead to:

ERROR:  could not find pathkey item to sort

Instead, use the top-level UNION's targetlist and fix the varnos in
setrefs.c.  Because set operation targetlists always use varno==0, we
can rewrite those to become varno==1, i.e. use the Vars from the first
UNION child.  This does result in showing Vars from relations that are
not present in the final plan, but that's no different to what we see
when normal base relations are proven dummy.

Without this fix it would be possible to see the following error in
EXPLAIN VERBOSE when all UNION inputs were proven empty.

ERROR:  bogus varno: 0

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvrUASy9sfULMEsM2udvZJP6AoBRCZvHYXYxZTy2tX9FYw@mail.gmail.com

Avoid unnecessary GinFormTuple() calls for incompressible posting lists.

Previously, we attempted to form a posting list tuple even when
ginCompressPostingList() failed to compress the posting list due to
its size. While there was no functional failure, it always wasted one
GinFormTuple() call when item pointers didn't fit in a posting list
tuple.

This commit ensures that a GIN index tuple is formed only when all
item pointers in the posting list are successfully compressed.

Author: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CAE7r3M+C=jcpTD93f_RBHrQp3C+=TAXFs+k4tTuZuuxboK8AvA@mail.gmail.com

Optimize hex_encode() and hex_decode() using SIMD.

The hex_encode() and hex_decode() functions serve as the workhorses
for hexadecimal data for bytea's text format conversion functions,
and some workloads are sensitive to their performance. This commit
adds new implementations that use routines from port/simd.h, which
testing indicates are much faster for larger inputs. For small or
invalid inputs, we fall back on the existing scalar versions.
Since we are using port/simd.h, these optimizations apply to both
x86-64 and AArch64.

Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Chiranmoy Bhattacharya <chiranmoy.bhattacharya@fujitsu.com>
Co-authored-by: Susmitha Devanga <devanga.susmitha@fujitsu.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/aLhVWTRy0QPbW2tl%40nathan

Revert "Improve docs syntax checking"

This reverts commit b292256272623d1a7532f3893a4565d1944742b4.

Further discussion is needed

Discussion: https://postgr.es/m/0198ec0f-0269-4cf4-b4a7-22c05b3047cb@eisentraut.org

Expose sequence page LSN via pg_get_sequence_data.

This patch enhances the pg_get_sequence_data function to include the
page-level LSN (Log Sequence Number) of the sequence. This additional
metadata will be used by upcoming patches to support synchronization
of sequences during logical replication.

By exposing the LSN, we enable more accurate tracking of sequence
changes, which is essential for maintaining consistency across
replicated nodes.

Author: vignesh C <vignesh21@gmail.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://www.postgresql.org/message-id/CAA4eK1LC+KJiAkSrpE_NwvNdidw9F2os7GERUeSxSKv71gXysQ@mail.gmail.com

Add comment in ginxlog.h about block used with ginxlogInsertListPage

All the other structures describe the list of blocks used, and in the
case of a GIN_INSERT_LISTPAGE record block 0 refers to a list page with
the items added to it.

Author: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CALdSSPgk=9WRoXhZy5fdk+T1hiau7qbL_vn94w_L1N=gtEdbsg@mail.gmail.com

Remove block information from description of some WAL records for GIN

The WAL records XLOG_GIN_INSERT and XLOG_GIN_VACUUM_DATA_LEAF_PAGE
included some information about the blocks added to the record.

This information is already provided by XLogRecGetBlockRefInfo() with
much more details about the blocks included in each record, like the
compression information, for example. This commit removes the block
information that existed in the record descriptions specific to GIN.

Author: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CALdSSPgk=9WRoXhZy5fdk+T1hiau7qbL_vn94w_L1N=gtEdbsg@mail.gmail.com

Add stats_reset to pg_stat_all_{tables,indexes} and related views

It is possible to call pg_stat_reset_single_table_counters() on a
relation (index or table) but the reset time was missing from the system
views showing their statistics.

This commit adds the reset time as an attribute of pg_stat_all_tables,
pg_stat_all_indexes, and other relations related to them.

Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, as a result of the new field added to
PgStat_StatTabEntry.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aN8l182jKxEq1h9f@paquier.xyz

Add test for pg_stat_reset_single_table_counters() on index

stats.sql is already doing some tests coverage on index statistics, by
retrieving for example idx_scan and friends in pg_stat_all_tables.
pg_stat_reset_single_table_counters() is supported for an index for a
long time, but the case was never covered.

This commit closes the gap, by using this reset function on an index,
cross-checking the contents of pg_stat_all_indexes.

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aN8l182jKxEq1h9f@paquier.xyz

Fix two comments in numeric.c

The comments at the top of numeric_int4_safe() and numeric_int8_safe()
mentioned respectively int4_numeric() and int8_numeric(). The intention
is to refer to numeric_int4() and numeric_int8().

Oversights in 4246a977bad6.

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxFfVt7Jx9_j=juxXyP-6tznN8OcvS9E-QSgp0BrD8KUgA@mail.gmail.com

Use SOCK_ERRNO[_SET] in fe-secure-gssapi.c.

On Windows, this code did not handle error conditions correctly at
all, since it looked at "errno" which is not used for socket-related
errors on that platform.  This resulted, for example, in failure
to connect to a PostgreSQL server with GSSAPI enabled.

We have a convention for dealing with this within libpq, which is to
use SOCK_ERRNO and SOCK_ERRNO_SET rather than touching errno directly;
but the GSSAPI code is a relative latecomer and did not get that memo.
(The equivalent backend code continues to use errno, because the
backend does this differently.  Maybe libpq's approach should be
rethought someday.)

Apparently nobody tries to build libpq with GSSAPI support on Windows,
or we'd have heard about this before, because it's been broken all
along.  Back-patch to all supported branches.

Author: Ning Wu <ning94803@gmail.com>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAFGqpvg-pRw=cdsUpKYfwY6D3d-m9tw8WMcAEE7HHWfm-oYWvw@mail.gmail.com
Backpatch-through: 13

Don't include access/htup_details.h in executor/tuptable.h

This is not at all needed; I suspect it was a simple mistake in commit
5408e233f066. It causes htup_details.h to bleed into a huge number of
places via execnodes.h. Remove it and fix fallout.

Discussion: https://postgr.es/m/202510021240.ptc2zl5cvwen@alvherre.pgsql

Don't include execnodes.h in brin.h or gin.h

These headers don't need execnodes.h for anything. I think they never
have.

Discussion: https://postgr.es/m/202510021240.ptc2zl5cvwen@alvherre.pgsql

Teach UNION planner to remove dummy inputs

This adjusts UNION planning so that the planner produces more optimal
plans when one or more of the UNION's subqueries have been proven to be
empty (a dummy rel).

If any of the inputs are empty, then that input can be removed from the
Append / MergeAppend.  Previously, a const-false "Result" node would
appear to represent this.  Removing empty inputs has a few extra
benefits when only 1 union child remains as it means the Append or
MergeAppend can be removed in setrefs.c, making the plan slightly faster
to execute.  Also, we can provide better n_distinct estimates by looking
at the sole remaining input rel's statistics.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAApHDvri53PPF76c3M94_QNWbJfXjyCnjXuj_2=LYM-0m8WZtw@mail.gmail.com

Use bms_add_members() instead of bms_union() when possible

bms_union() causes a new set to be allocated.  What this caller needs is
members added to an existing set.  bms_add_members() is the tool for
that job.

This is just a matter of fixing an inefficiency due to surplus memory
allocations.  No bugs being fixed.

The only other place I found that might be valid to apply this change is
in markNullableIfNeeded(), but I opted not to do that due to the risk to
reward ratio not looking favorable.  The risk being that there *could* be
another pointer pointing to the Bitmapset.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAApHDvoCcoS-p5tZNJLTxFOKTYNjqVh7Dwf+5ikDUBwnvWftRw@mail.gmail.com

Optimize vector8_has_le() on AArch64.

Presently, the SIMD implementation of this function uses unsigned
saturating subtraction to find bytes less than or equal to the
given value, which is a workaround for the lack of unsigned
comparison instructions on some architectures. However, Neon
offers vminvq_u8(), which returns the minimum (unsigned) value in
the vector. This commit adds a Neon-specific implementation that
uses vminvq_u8() to optimize vector8_has_le() on AArch64.

In passing, adjust the SSE2 implementation to use vector8_min() and
vector8_eq() to find values less than or equal to the given value.
This was the only use of vector8_ssub(), so it has been removed.

Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/aNHDNDSHleq0ogC_%40nathan

Make some use of anonymous unions [DSM registry].

Make some use of anonymous unions, which are allowed as of C11, as
examples and encouragement for future code, and to test compilers.

This commit changes the DSMRegistryEntry struct.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/aNKsDg0fJwqhZdXX%40nathan

Tidy-up unneeded NULL parameter checks from SQL function

This function is marked as strict, so we can safely remove the checks
checking for NULL input parameters.

Author: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/CAApHDvqiN0+mbooUOSCDALc=GoM8DmTbCdvwnCwak6Wb2O1ZJA@mail.gmail.com

Fix reuse-after-free hazard in dead_items_reset

In similar vein to commit ccc8194e427, a reset instance of a shared
memory TID store happened to occupy the same private memory as the old
one for the entry point, since the chunk freed after the last round
of index vacuuming was put on the context's freelist. The failure
to update the vacrel->dead_items pointer was evident by nudging the
system to allocate memory in a different area. This was not discovered
at the time of the earlier commit since our regression tests didn't
cover multiple index passes with parallel vacuum.

Backpatch to v17, when TidStore came in.

Author: Kevin Oommen Anish <kevin.o@zohocorp.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Tested-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/199a07cbdfc.7a1c4aac25838.1675074408277594551%40zohocorp.com
Backpatch-through: 17

Fix incorrect function reference in comment

The comment incorrectly references the defunct function
BufFileOpenShared(), which was replaced in commit dcac5e7ac.

This patch updates the comment to refer to the current function
BufFileOpenFileSet().

Author: Zhang Mingli <zmlpostgres@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Discussion: https://postgr.es/m/1cb48b4c-54ab-40cc-b355-0b3c2af6d3f7@Spark

pgbench: Fail cleanly when finding a COPY result state

Currently, pgbench aborts when a COPY response is received in
readCommandResponse().  However, as PQgetResult() returns an empty
result when there is no asynchronous result, through getCopyResult(),
the logic done at the end of readCommandResponse() for the error path
leads to an infinite loop.

This commit forcefully exits the COPY state with PQendcopy() before
moving to the error handler when fiding a COPY state, avoiding the
infinite loop.  The COPY protocol is not supported by pgbench anyway, as
an error is assumed in this case, so giving up is better than having the
tool be stuck forever.  pgbench was interruptible in this state.

A TAP test is added to check that an error happens if trying to use
COPY.

Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqpHyF2m73ifV5a=5jhXxH2chk=XrgefY+eWWPe2Eft3=A@mail.gmail.com
Backpatch-through: 13

Add IGNORE NULLS/RESPECT NULLS option to Window functions.

Add IGNORE NULLS/RESPECT NULLS option (null treatment clause) to lead,
lag, first_value, last_value and nth_value window functions. If
unspecified, the default is RESPECT NULLS which includes NULL values
in any result calculation. IGNORE NULLS ignores NULL values.

Built-in window functions are modified to call new API
WinCheckAndInitializeNullTreatment() to indicate whether they accept
IGNORE NULLS/RESPECT NULLS option or not (the API can be called by
user defined window functions as well). If WinGetFuncArgInPartition's
allowNullTreatment argument is true and IGNORE NULLS option is given,
WinGetFuncArgInPartition() or WinGetFuncArgInFrame() will return
evaluated function's argument expression on specified non NULL row (if
it exists) in the partition or the frame.

When IGNORE NULLS option is given, window functions need to visit and
evaluate same rows over and over again to look for non null rows. To
mitigate the issue, 2-bit not null information array is created while
executing window functions to remember whether the row has been
already evaluated to NULL or NOT NULL. If already evaluated, we could
skip the evaluation work, thus we could get better performance.

Author: Oliver Ford <ojford@gmail.com>
Co-authored-by: Tatsuo Ishii <ishii@postgresql.org>
Reviewed-by: Krasiyan Andreev <krasiyan@gmail.com>
Reviewed-by: Andrew Gierth <andrew@tao11.riddles.org.uk>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: David Fetter <david@fetter.org>
Reviewed-by: Vik Fearing <vik@postgresfriends.org>
Reviewed-by: "David G. Johnston" <david.g.johnston@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Discussion: https://postgr.es/m/flat/CAGMVOdsbtRwE_4+v8zjH1d9xfovDeQAGLkP_B6k69_VoFEgX-A@mail.gmail.com

Remove check for NULL in STRICT function

test_bms_make_singleton is defined as STRICT and only takes a single
parameter, so there is no need to check that parameter for NULL as a
NULL input won't ever reach there.

Author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/BC483901-9587-4076-B20F-9A414C66AB78@yesql.se

Fixes for comments in test_bitmapset

This fixes a typo in the sql/expected test files and removes a
leftover comment from test_bitmapset.c from when the functions
invoked bms_free.

Author: Daniel Gustafsson <daniel@yesql.se>
Reported-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/978D21E8-9D3B-40EA-A4B1-F87BABE7868C@yesql.se

Improve docs syntax checking

Move the checks out of the Makefile into a perl script that can be
called from both the Makefile and meson.build. The set of files checked
is simplified, so it is just all the sgml and xsl files found in
docs/src/sgml directory tree.

Along the way make some adjustments to .cirrus.tasks.yml to support this
better in CI.

Also ensure that the checks are part of the Makefile's html target.

Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Co-Author: Andrew Dunstan <andrew@dunslane.net>

Discussion: https://postgr.es/m/CAN55FZ3BnM+0twT-ZWL8As9oBEte_b+SBU==cz6Hk8JUCM_5Wg@mail.gmail.com

doc: Improve wording for base64url definition

This sentence should be "the alphabet uses" due to it referring to
multiple cases of use.

Reported-by: Erik Rijkers <er@xs4all.nl>
Discussion: https://postgr.es/m/81d6ab37-92dc-75c9-a649-4e1286a343ea@xs4all.nl

Remove useless pointer update in ginxlog.c

Oversight in 2c03216d8311, when the redo code of GIN got refactored for
the new WAL format where block information has been standardized, as the
payload data got tracked for each block after the change, and not in the
whole record. This is just a cleanup.

Author: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Discussion: https://postgr.es/m/CALdSSPgnAt5L=D_xGXRXLYO5FK1H31_eYEESxdU1n-r4g+6GqA@mail.gmail.com

Generate EUC_CN mappings from gb18030-2022.ucm

In the wake of cfa6cd292, EUC_CN was the only encoding that used
gb-18030-2000.xml to generate the .map files. Since EUC_CN is a subset
of GB18030, we can easily use the same UCM file. This allows deleting
the XML file from our repository.

Author: Chao Li <lic@highgo.com>
Discussion: https://postgr.es/m/CANWCAZaNRXZ-5NuXmsaMA2mKvMZnCGHZqQusLkpE%2B8YX%2Bi5OYg%40mail.gmail.com

pgstattuple: Improve reports generated for indexes (hash, gist, btree)

pgstattuple checks the state of the pages retrieved for gist and hash
using some check functions from each index AM, respectively
gistcheckpage() and _hash_checkpage().  When these are called, they
would fail when bumping on data that is found as incorrect (like opaque
area size not matching, or empty pages), contrary to btree that simply
discards these cases and continues to aggregate data.

Zero pages can happen after a crash, with these AMs being able to do an
internal cleanup when these are seen.  Also, sporadic failures are
annoying when doing for example a large-scale diagnostic query based on
pgstattuple with a join of pg_class, as it forces one to use tricks like
quals to discard hash or gist indexes, or use a PL wrapper able to catch
errors.

This commit changes the reports generated for btree, gist and hash to
be more user-friendly;
- When seeing an empty page, report it as free space.  This new rule
applies to gist and hash, and already applied to btree.
- For btree, a check based on the size of BTPageOpaqueData is added.
- For gist indexes, gistcheckpage() is not called anymore, replaced by a
check based on the size of GISTPageOpaqueData.
- For hash indexes, instead of _hash_getbuf_with_strategy(), use a
direct call to ReadBufferExtended(), coupled with a check based on
HashPageOpaqueData.  The opaque area size check was already used.
- Pages that do not match these criterias are discarded from the stats
reports generated.

There have been a couple of bug reports over the years that complained
about the current behavior for hash and gist, as being not that useful,
with nothing being done about it.  Hence this change is backpatched down
to v13.

Reported-by: Noah Misch <noah@leadboat.com>
Author: Nitin Motiani <nitinmotiani@google.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/CAH5HC95gT1J3dRYK4qEnaywG8RqjbwDdt04wuj8p39R=HukayA@mail.gmail.com
Backpatch-through: 13

test_json_parser: Speed up 002_inline.pl

Some macOS machines are having trouble with 002_inline, which executes
the JSON parser test executables hundreds of times in a nested loop.
Both developer machines and buildfarm critters have shown excessive test
durations, upwards of 20 seconds.

Push the innermost loop of 002_inline, which iterates through differing
chunk sizes, down into the test executable. (I'd eventually like to push
all of the JSON unit tests down into C, but this is an easy win in the
short term.) Testers have reported a speedup between 4-9x.

Reported-by: Robert Haas <robertmhaas@gmail.com>
Suggested-by: Andres Freund <andres@anarazel.de>
Tested-by: Andrew Dunstan <andrew@dunslane.net>
Tested-by: Tom Lane <tgl@sss.pgh.pa.us>
Tested-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA%2BTgmobKoG%2BgKzH9qB7uE4MFo-z1hn7UngqAe9b0UqNbn3_XGQ%40mail.gmail.com
Backpatch-through: 17

Fix compiler warnings around _CRT_glob

Newer compilers warned about

    extern int _CRT_glob = 0;

which is indeed a mysterious C construction, as it combines "extern"
and an initialization.  It turns out that according to the C standard,
the "extern" is ignored here, so we can remove it to resolve the
warnings.  But then we also need to add a real extern
declaration (without initializer) to satisfy
-Wmissing-variable-declarations.

(Note that this code is only active on MinGW.)

Discussion: https://www.postgresql.org/message-id/1053279b-da01-4eb4-b7a3-da6b5d8f73d1%40eisentraut.org

Minor fixups of test_bitmapset.c

The macro's comment had become outdated from a prior version and there's
now no longer a need for the do/while loop (or my misplaced semi-colon).

Author: David Rowley <dgrowleyml@gmail.com>
Discussion: https://postgr.es/m/CAApHDvr+P454SP_LDvB=bViPAbDQhV1Jmg5M55GEKz0d3z25NA@mail.gmail.com

test_bitmapset: Simplify code of the module

Two macros are added in this module, to cut duplicated patterns:
- PG_ARG_GETBITMAPSET(), for input argument handling, with knowledge
about NULL.
- PG_RETURN_BITMAPSET_AS_TEXT(), that generates a text result from a
Bitmapset.

These changes limit the code so as the SQL functions are now mostly
wrappers of the equivalent C function.  Functions that use integer input
arguments still need some NULL handling, like bms_make_singleton().

A NULL input is translated to "<>", which is what nodeToString()
generates.  Some of the tests are able to generate this result.

Per discussion, the calls of bms_free() are removed.  These may be
justified if the functions are used in a rather long-lived memory
context, but let's keep the code minimal for now.  These calls used NULL
checks, which were also not necessary as NULL is an input authorized by
bms_free().

Some of the tests existed to cover behaviors related to the SQL
functions for NULL inputs.  Most of them are still relevant, as the
routines of bitmapset.c are able to handle such cases.

The coverage reports of bitmapset.c and test_bitmapset.c remain the
same after these changes, with 300 lines of C code removed.

Author: David Rowley <dgrowleyml@gmail.com>
Co-authored-by: Greg Burd <greg@burd.me>
Discussion: https://postgr.es/m/CAApHDvqghMnm_zgSNefto9oaEJ0S-3Cgb3gdsV7XvLC-hMS02Q@mail.gmail.com

Rename pg_builtin_integer_constant_p to pg_integer_constant_p

Since it's not builtin. Also fix a related typo.

Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAApHDvom02B_XNVSkvxznVUyZbjGAR%2B5myA89ZcbEd3%3DPA9UcA%40mail.gmail.com

pgbench: Fix error reporting in readCommandResponse().

pgbench uses readCommandResponse() to process server responses.
When readCommandResponse() encounters an error during a call to
PQgetResult() to fetch the current result, it attempts to report it
with an additional error message from PQerrorMessage(). However,
previously, this extra error message could be lost or become incorrect.

The cause was that after fetching the current result (and detecting
an error), readCommandResponse() called PQgetResult() again to
peek at the next result. This second call could overwrite the libpq
connection's error message before the original error was reported,
causing the error message retrieved from PQerrorMessage() to be
lost or overwritten.

This commit fixes the issue by updating readCommandResponse()
to use PQresultErrorMessage() instead of PQerrorMessage()
to retrieve the error message generated when the PQgetResult()
for the current result causes an error, ensuring the correct message
is reported.

Backpatch to all supported versions.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20250925110940.ebacc31725758ec47d5432c6@sraoss.co.jp
Backpatch-through: 13

Fix typo in pgstat_relation.c header comment

Looks like a copy and paste error from pgstat_function.c

Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/aNuaVMAdTGbgBgqh@ip-10-97-1-34.eu-west-3.compute.internal

Revert "Make some use of anonymous unions [pgcrypto]"

This reverts commit efcd5199d8cb8e5098f79b38d0c46004e69d1a46.

I rebased my patch series incorrectly. This patch contained unrelated
parts from another patch, which made the overall build fail. Revert
for now and reconsider.

Make some use of anonymous unions [plpython]

Make some use of anonymous unions, which are allowed as of C11, as
examples and encouragement for future code, and to test compilers.

This commit changes some structures in plpython.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org

Make some use of anonymous unions [pgcrypto]

Make some use of anonymous unions, which are allowed as of C11, as
examples and encouragement for future code, and to test compilers.

This commit changes some structures in pgcrypto.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org

Make some use of anonymous unions [reorderbuffer xact_time]

Make some use of anonymous unions, which are allowed as of C11, as
examples and encouragement for future code, and to test compilers.

This commit changes the ReorderBufferTXN struct.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org

Make some use of anonymous unions [pg_locale_t]

Make some use of anonymous unions, which are allowed as of C11, as
examples and encouragement for future code, and to test compilers.

This commit changes the pg_locale_t type.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/f00a9968-388e-4f8c-b5ef-5102e962d997%40eisentraut.org

Do a tiny bit of header file maintenance

Stop including utils/relcache.h in access/genam.h, and stop including
htup_details.h in nodes/tidbitmap.h. Both these files (genam.h and
tidbitmap.h) are widely used in other header files, so it's in our best
interest that they remain as lean as reasonable.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/202509291356.o5t6ny2hoa3q@alvherre.pgsql

Reorder XLogNeedsFlush() checks to be more consistent

During recovery, XLogNeedsFlush() checks the minimum recovery LSN point
instead of the flush LSN point.  The same condition checks are used when
updating the minimum recovery point in UpdateMinRecoveryPoint(), but are
written in reverse order.

This commit makes the order of the checks consistent between
XLogNeedsFlush() and UpdateMinRecoveryPoint(), improving the code
clarity.  Note that the second check (as ordered by this commit) relies
on InRecovery, which is true only in the startup process.  So this makes
XLogNeedsFlush() cheaper in the startup process with the first check
acting as a shortcut while doing crash recovery, where
LocalMinRecoveryPoint is an invalid LSN.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Discussion: https://postgr.es/m/aMIHNRTP6Wj6vw1s%40paquier.xyz

injection_points: Add proper locking when reporting fixed-variable stats

Contrary to its siblings for the archiver, the bgwriter and the
checkpointer stats, pgstat_report_inj_fixed() can be called
concurrently. This was causing an assertion failure, while messing up
with the stats.

This code is aimed at being a template for extension developers, so it
is not a critical issue, but let's be correct. This module has also
been useful for some benchmarking, at least for me, and that was how I
have discovered this issue.

Oversight in f68cd847fa40.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/aNnXbAXHPFUWPIz2@paquier.xyz
Backpatch-through: 18

Add GROUP BY ALL.

GROUP BY ALL is a form of GROUP BY that adds any TargetExpr that does
not contain an aggregate or window function into the groupClause of
the query, making it exactly equivalent to specifying those same
expressions in an explicit GROUP BY list.

This feature is useful for certain kinds of data exploration.  It's
already present in some other DBMSes, and the SQL committee recently
accepted it into the standard, so we can be reasonably confident in
the syntax being stable.  We do have to invent part of the semantics,
as the standard doesn't allow for expressions in GROUP BY, so they
haven't specified what to do with window functions.  We assume that
those should be treated like aggregates, i.e., left out of the
constructed GROUP BY list.

In passing, wordsmith some existing documentation about GROUP BY,
and update some neglected synopsis entries in select_into.sgml.

Author: David Christensen <david@pgguru.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHM0NXjz0kDwtzoe-fnHAqPB1qA8_VJN0XAmCgUZ+iPnvP5LbA@mail.gmail.com

Remove unused parameter from find_window_run_conditions()

... and check_and_push_window_quals().

Similar to 4be9024d5, but it seems there was yet another unused
parameter.

Author: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/DD5BEKORUG34.2M8492NMB9DB8@gmail.com

Fix StatisticsObjIsVisibleExt() for pg_temp.

Neighbor get_statistics_object_oid() ignores objects in pg_temp, as has
been the standard for non-relation, non-type namespace searches since
CVE-2007-2138.  Hence, most operations that name a statistics object
correctly decline to map an unqualified name to a statistics object in
pg_temp.  StatisticsObjIsVisibleExt() did not.  Consequently,
pg_statistics_obj_is_visible() wrongly returned true for such objects,
psql \dX wrongly listed them, and getObjectDescription()-based ereport()
and pg_describe_object() wrongly omitted namespace qualification.  Any
malfunction beyond that would depend on how a human or application acts
on those wrong indications.  Commit
d99d58cdc8c0b5b50ee92995e8575c100b1a458a introduced this.  Back-patch to
v13 (all supported versions).

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/20250920162116.2e.nmisch@google.com
Backpatch-through: 13

test_bitmapset: Expand more the test coverage

This commit expands the set of tests added by 00c3d87a5cab, to bring the
coverage of bitmapset.c close to 100% by addressing a lot of corner
cases (most of these relate to word counts and reallocations).

Some of the functions of this module also have their own idea of the
result to return depending on the input values given. These are
specific to the module, still let's add more coverage for all of them.

Some comments are made more consistent in the tests, while on it.

Author: Greg Burd <greg@burd.me>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/aNR-gsGmLnMaNT5i@paquier.xyz

Improve planner's width estimates for set operations

For UNION, EXCEPT and INTERSECT, we were not very good at estimating the
PathTarget.width for the set operation.  Since the targetlist of the set
operation is made up of Vars with varno==0, this would result in
get_expr_width() applying a default estimate based on the Var's type
rather than taking width estimates from any relation's statistics.

Here we attempt to improve the situation by looking at the width estimates
for the set operation child paths and calculating the average width of the
relevant child paths weighted over the estimated number of rows.  For
UNION and INTERSECT, the relevant paths to look at are *all* child paths.
For EXCEPT, since we don't return rows from the right-hand child (only
possibly remove left-hand rows matching those), we use only the left-hand
child for width estimates.

This also adjusts the hashed-UNION Path's PathTarget to use the same
PathTarget as its Append subpath.  Both PathTargets will be the same and
are void of any resjunk columns, per generate_append_tlist().  Making
the AggPath use the same PathTarget saves having to adjust the "width"
of the AggPath's PathTarget too.

This was reported as a bug by sunw.fnst, but it's not something we ever
claimed to do properly.  Plus, if we were to adjust this in back
branches, plans could change as the estimated input sizes to Sorts and
Hash Aggregates could go up or down.  Plan choices aren't something we
want to destabilize in stable versions.

Reported-by: sunw.fnst <936739278@qq.com>
Author: David Rowley <drowleyml@gmail.com>
Discussion: https://postgr.es/m/tencent_34CF8017AB81944A4C08DD089D410AB6C306@qq.com

injection_points: Enable entry count in its variable-sized stats

This serves as coverage for the tracking of entry count added by
7bd2975fa92b as built-in variable-sized stats kinds have no need for it,
at least not yet.

A new function, called injection_points_stats_count(), is added to the
module. It is able to return the number of entries. This has been
useful when doing some benchmarking to check the sanity of the counts.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aMPKWR81KT5UXvEr@paquier.xyz

Add support for tracking of entry count in pgstats

Stats kinds can set a new option called "track_entry_count" (disabled by
default, available for variable-numbered stats) that will make pgstats
track the number of entries that exist in its shared hashtable.

As there is only one code path where a new entry is added, and one code
path where entries are freed, the count tracking is straight-forward in
its implementation.  Reads of these counters are optimistic, and may
change across two calls.  The counter is incremented when an entry is
created (not when reused), and is decremented when an entry is freed
from the hashtable (marked for drop with its refcount reaching 0), which
is something that pgstats decides internally.

A first use case of this facility would be pg_stat_statements, where we
need to be able to cap the number of entries that would be stored in the
shared hashtable, based on its "max" GUC.  The module currently relies
on hash_get_num_entries(), which offers a cheap way to count how many
entries are in its hash table, but we cannot do that in pgstats for
variable-sized stats kinds as a single hashtable is used for all the
stats kinds.  Independently of PGSS, this is useful for other custom
stats kinds that want to cap, control, or track the number of entries
they have, without depending on a potentially expensive sequential scan
to know the number of entries while holding an extra exclusive lock.

Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Keisuke Kuroda <keisuke.kuroda.3862@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Discussion: https://postgr.es/m/aMPKWR81KT5UXvEr@paquier.xyz

Refactor to avoid code duplication in transformPLAssignStmt.

transformPLAssignStmt contained many lines cribbed directly from
transformSelectStmt.  I had supposed that we could manage to keep
the two copies in sync, but the bug just fixed in 7504d2be9 shows
that that hope was foolish.  Let's refactor so there's just one copy.

The main stumbling block to doing this is that transformPLAssignStmt
has a chunk of custom code that has to run after transformTargetList
but before we potentially modify the tlist further during analysis
of ORDER BY and GROUP BY.  Rather than make transformSelectStmt fully
aware of PLAssignStmt processing, I put that code into a callback
function.  It still feels a little bit ugly, but it's not too awful,
and surely it's better than a hundred lines of duplicated code.
The steps involved in processing a PLAssignStmt remain exactly
the same as before, just in different places.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/31027.1758919078@sss.pgh.pa.us

Fix missed copying of groupDistinct in transformPLAssignStmt.

Because we failed to do this, DISTINCT in GROUP BY DISTINCT would be
ignored in PL/pgSQL assignment statements.  It's not surprising that
no one noticed, since such statements will throw an error if the query
produces more than one row.  That eliminates most scenarios where
advanced forms of GROUP BY could be useful, and indeed makes it hard
even to find a simple test case.  Nonetheless it's wrong.

This is directly the fault of be45be9c3 which added the groupDistinct
field, but I think much of the blame has to fall on c9d529848, in
which I incautiously supposed that we'd manage to keep two copies of
a big chunk of parse-analysis logic in sync.  As a follow-up, I plan
to refactor so that there's only one copy.  But that seems useful
only in master, so let's use this one-line fix for the back branches.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/31027.1758919078@sss.pgh.pa.us
Backpatch-through: 14

Teach MSVC that elog/ereport ERROR doesn't return

It had always been intended that this already works correctly as
pg_unreachable() uses __assume(0) on MSVC, and that directs the compiler
in a way so it knows that a given function won't return.  However, with
ereport_domain(), it didn't work...

It's now understood that the failure to determine that elog(ERROR) does not
return comes from the inability of the MSVC compiler to detect the "const
int elevel_" is the same as the "elevel" macro parameter.  MSVC seems to be
unable to make the "if (elevel_ >= ERROR) branch constantly-true when the
macro is used with any elevel >= ERROR, therefore the pg_unreachable() is
seen to only be present in a *conditional* branch rather than present
*unconditionally*.

While there seems to be no way to force the compiler into knowing that
elevel_ is equal to elevel within the ereport_domain() macro, there is a
way in C11 to determine if the elevel parameter is a compile-time
constant or not.  This is done via some hackery using the _Generic()
intrinsic function, which gives us functionality similar to GCC's
__builtin_constant_p(), albeit only for integers.

Here we define pg_builtin_integer_constant_p() for this purpose.
Callers can check for availability via HAVE_PG_BUILTIN_INTEGER_CONSTANT_P.
ereport_domain() has been adjusted to use
pg_builtin_integer_constant_p() instead of __builtin_constant_p().

It's not quite clear at this stage if this now allows us to forego doing
the likes of "return NULL; /* keep compiler quiet */" as there may be other
compilers in use that have similar struggles.  It's just a matter of time
before someone commits a function that does not "return" a value after
an elog(ERROR).  Let's make time and lack of complaints about said commit
be the judge of if we need to continue the "/* keep compiler quiet */"
palaver.

Author: David Rowley <drowleyml@gmail.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CAApHDvom02B_XNVSkvxznVUyZbjGAR+5myA89ZcbEd3=PA9UcA@mail.gmail.com

Remove unused for_all_tables field from AlterPublicationStmt.

No backpatch as AlterPublicationStmt struct is exposed.

Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/CAD21AoC6B6AuxWOST-TkxUbDgp8FwX=BLEJZmKLG_VrH-hfxpA@mail.gmail.com

Split vacuumdb to create vacuuming.c/h

This allows these routines to be reused by a future utility heavily
based on vacuumdb.

I made a few relatively minor changes from the original, most notably:

- objfilter was declared as an enum but the values are bit-or'ed, and
  individual bits are tested throughout the code.  We've discussed this
  coding pattern in other contexts and stayed away from it, on the
  grounds that the values so generated aren't really true values of the
  enum.  This commit changes it to be a bits32 with a few #defines for
  the flag definitions, the way we do elsewhere.  Also, instead of being
  a global variable, it's now in the vacuumingOptions struct.

- Two booleans, analyze_only (in vacuumingOptions) and analyze_in_stages
  (passed around as a separate boolean argument), are really determining
  what "mode" the program runs in -- it's either vacuum, or one of those
  two modes.  I have three adjectives for them: inconsistent,
  unergonomic, unorthodox.  Removing these and replacing them with a
  mode enum to be kept in vacuumingOptions makes the code structure easier
  to understand in a couple of places, and it'll be useful for the new
  mode we add next, so do that.

Reviewed-by: Antonin Houska <ah@cybertec.at>
Discussion: https://postgr.es/m/202508301750.cbohxyy2pcce@alvherre.pgsql

Create a separate file listing backend types

Use our established coding pattern to reduce maintenance pain when
adding other per-process-type characteristics.

Like PG_KEYWORD, PG_CMDTAG, PG_RMGR.

To keep the strings translatable, the relevant makefile now also scans
src/include for this specific file. I didn't want to have it scan all
.h files, as then gettext would have to scan all header files. I didn't
find any way to affect the meson behavior in this respect though.

Author: Álvaro Herrera <alvherre@kurilemu.de>
Co-authored-by: Jonathan Gonzalez V. <jonathan.abdiel@gmail.com>
Discussion: https://postgr.es/m/202507151830.dwgz5nmmqtdy@alvherre.pgsql

pgbench: Fix assertion failure with retriable errors in pipeline mode.

When running pgbench with --verbose-errors option and a custom script that
triggered retriable errors (e.g., serialization errors) in pipeline mode,
an assertion failure could occur:

Assertion failed: (sql_script[st->use_file].commands[st->command]->type == 1), function commandError, file pgbench.c, line 3062.

The failure happened because pgbench assumed these errors would only occur
during SQL commands, but in pipeline mode they can also happen during
\endpipeline meta command.

This commit fixes the assertion failure by adjusting the assertion check to
allow such errors during either SQL commands or \endpipeline.

Backpatch to v15, where the assertion check was introduced.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwGWQMOzNkQs-LmpDHdNC0h8dmAuUMRvZrEntQi5a-b=Kg@mail.gmail.com

Improve stability of btree page split on ERRORs

This improves the stability of VACUUM when processing btree indexes,
which was previously able to trigger an assertion failure in
_bt_lock_subtree_parent() when an error was previously thrown outside
the scope of _bt_split() when splitting a btree page.  VACUUM would
consider the index as in a corrupted state as the right page would not
be zeroed for the error thrown (allocation failure is one pattern).

In a non-assert build, VACUUM is able to succeed, reporting what it sees
as a corruption while attempting to fix the index.  This would manifest
as a LOG message, as of:
LOG: failed to re-find parent key in index "idx" for deletion target
page N
CONTEXT:  while vacuuming index "idx" of relation "public.tab"

This commit improves the code to rely on two PGAlignedBlocks that are
used as a temporary space for the left and right pages.  The main change
concerns the right page, whose contents are now copied into the
"temporary" PGAlignedBlock page while its original space is zeroed.  Its
contents are moved from the PGAlignedBlock page back to the page once we
enter in the critical section used for the split.  This simplifies the
split logic, as it is not necessary to zero the right page before
throwing an error anymore.  Hence errors can now be thrown outside the
split code.  For the left page, this shaves one allocation, with
PageGetTempPage() being previously used.

The previous logic originates from commit 8fa30f906b, at a point where
PGAlignedBlock did not exist yet.  This could be argued as something
that should be backpatched, but the lack of complaints indicates that it
may not be necessary.

Author: Konstantin Knizhnik <knizhnik@garret.ru>
Discussion: https://postgr.es/m/566dacaf-5751-47e4-abc6-73de17a5d42a@garret.ru

Fix misleading comment in pg_get_statisticsobjdef_string()

The comment claimed that a TABLESPACE reference was added to the
resulting string, but that's not true. Looks like the comment was
copied from pg_get_indexdef_string() without being adjusted correctly.

Reported-by: jian he <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxHwVPgeu8o9D8oUeDQYEHTAZGt-J5uaJNgYMzkAW7MiCA@mail.gmail.com

Remove unused parameter from check_and_push_window_quals

... and find_window_run_conditions.

This seems to have been around and unused ever since the Run Condition
feature was added in 9d9c02ccd. Let's remove it to clean things up a
bit.

Author: Matheus Alcantara <matheusssilv97@gmail.com>
Discussion: https://postgr.es/m/DD26NJ0Y34ZS.2ZOJPHSY12PFI@gmail.com

psql: Add COMPLETE_WITH_FILES and COMPLETE_WITH_GENERATOR macros.

While most tab completions in match_previous_words() use
COMPLETE_WITH* macros to wrap rl_completion_matches(), some direct
calls to rl_completion_matches() still remained.

This commit introduces COMPLETE_WITH_FILES and COMPLETE_WITH_GENERATOR
macros to replace these direct calls, enhancing both code consistency
and readability.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/20250605100835.b396f9d656df1018f65a4556@sraoss.co.jp

Try to avoid floating-point roundoff error in pg_sleep().

I noticed the surprising behavior that pg_sleep(0.001) will sleep
for 2ms not the expected 1ms.  Apparently the float8 calculation of
time-to-sleep is managing to produce something a hair over 1, which
ceil() rounds up to 2, and then WaitLatch() faithfully waits 2ms.
It could be that this works as-expected for some ranges of current
timestamp but not others, which would account for not having seen
it before.  In any case, let's try to avoid it by removing the
float arithmetic in the delay calculation.  We're stuck with the
declared input type being float8, but we can convert that to integer
microseconds right away, and then work strictly with integral values.
There might still be roundoff surprises for certain input values,
but at least the behavior won't be time-varying.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/3879137.1758825752@sss.pgh.pa.us

Add minimal sleep to stats isolation test functions.

The functions test_stat_func() and test_stat_func2() had empty
function bodies, so that they took very little time to run.  This made
it possible that on machines with relatively low timer resolution the
functions could return before the clock advanced, making the test fail
(as seen on buildfarm members fruitcrow and hamerkop).

To avoid that, pg_sleep for 10us during the functions.  As far as we
can tell, all current hardware has clock resolution much less than
that.  (The current implementation of pg_sleep will round it up to
1ms anyway, but someday that might get improved.)

Author: Michael Banck <mbanck@gmx.net>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/68d413a3.a70a0220.24c74c.8be9@mx.google.com
Backpatch-through: 15

Fix array allocation bugs in SetExplainExtensionState.

If we already have an extension_state array but see a new extension_id
much larger than the highest the extension_id we've previously seen,
the old code might have failed to expand the array to a large enough
size, leading to disaster. Also, if we don't have an extension array
at all and need to create one, we should make sure that it's big enough
that we don't have to resize it instantly.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: http://postgr.es/m/2949591.1758570711@sss.pgh.pa.us
Backpatch-through: 18

Doc: clean up documentation for new UUID functions.

Fix assorted failures to conform to our normal style for function
documentation, such as lack of parentheses and incorrect markup.

Author: Marcos Pegoraro <marcos@f10.com.br>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAB-JLwbocrFjKfGHoKY43pHTf49Ca2O0j3WVebC8z-eQBMPJyw@mail.gmail.com
Backpatch-through: 18

Teach doc/src/sgml/Makefile about the new func/*.sgml files.

These were omitted from build dependencies and also tab/nbsp
checks, with the result that "make" did nothing after modifying
a func/*.sgml file.

Oversight in 4e23c9ef6. AFAICT we don't need any comparable
changes in meson.build, or at least I don't see it doing anything
special for the pre-existing ref/*.sgml files.

Remove preprocessor guards from injection points

When defining an injection point there is no need to wrap the definition
with USE_INJECTION_POINT guards, the INJECTION_POINT macro is available
in all builds. Remove to make the code consistent.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/OSCPR01MB14966C8015DEB05ABEF2CE077F51FA@OSCPR01MB14966.jpnprd01.prod.outlook.com
Backpatch-through: 17

Fix comments in recovery tests

Commit 4464fddf removed the large insertions but missed to remove
all the comments referring to them. Also remove a superfluous ')'
in another comment.

Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/OSCPR01MB149663A99DAF2826BE691C23DF51FA@OSCPR01MB14966.jpnprd01.prod.outlook.com

Don't include execnodes.h in replication/conflict.h

... which silently propagates a lot of headers into many places
via pgstat.h, as evidenced by the variety of headers that this patch
needs to add to seemingly random places. Add a minimum of typedefs to
conflict.h to be able to remove execnodes.h, and fix the fallout.

Backpatch to 18, where conflict.h first appeared.

Discussion: https://postgr.es/m/202509191927.uj2ijwmho7nv@alvherre.pgsql

Update some more forward declarations to use typedef

As commit d4d1fc527bdb.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/202509191025.22agk3fvpilc@alvherre.pgsql

pgbench: Fix typo in documentation.

This commit fixes a typo introduced in commit b6290ea48e1.

Reported off-list by Erik Rijkers <er@xs4all.nl>

pgbench: Clarify documentation for \gset and \aset.

This commit updates the pgbench documentation to list \gset and \aset
as separate terms for easier reading. It also clarifies that \gset raises
an error if the query returns zero or multiple rows, and explains how to
detect cases where the query with \aset returned no rows.

Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20250626180125.5b896902a3d0bcd93f86c240@sraoss.co.jp

vacuumdb: Do not run VACUUM (ONLY_DATABASE_STATS) when --analyze-only.

Previously, vacuumdb --analyze-only issued VACUUM (ONLY_DATABASE_STATS)
at the end. Since --analyze-only is meant to update optimizer statistics only,
this extra VACUUM command is unnecessary.

This commit prevents vacuumdb --analyze-only from running that redundant
VACUUM command.

Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Mircea Cadariu <cadariu.mircea@gmail.com>
Discussion: https://postgr.es/m/CAHGQGwEqHGa-k=wbRMucUVihHVXk4NQkK94GNN=ym9cQ5HBSHg@mail.gmail.com

Correct prune WAL record opcode name in comment

f83d709760d8 incorrectly refers to a XLOG_HEAP2_PRUNE_FREEZE WAL record
opcode. No such code exists. The relevant opcodes are
XLOG_HEAP2_PRUNE_ON_ACCESS, XLOG_HEAP2_PRUNE_VACUUM_SCAN, and
XLOG_HEAP2_PRUNE_VACUUM_CLEANUP. Correct it.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/yn4zp35kkdsjx6wf47zcfmxgexxt4h2og47pvnw2x5ifyrs3qc%407uw6jyyxuyf7

Ensure guc_tables.o's dependency on guc_tables.inc.c is known.

Without this, rebuilds can malfunction unless --enable-depend is used.
Historically we've expected that you can get away without
--enable-depend as long as you manually clean after changing *.h
files; the makefiles are supposed to handle other sorts of
dependencies. So add this one.

Follow-on to 635998965, so no need for back-patch.

Discussion: https://postgr.es/m/3121329.1758650878@sss.pgh.pa.us

Include pg_test_timing's full output in the TAP test log.

We were already doing a short (1-second) pg_test_timing run during
check-world and buildfarm runs. But we weren't doing anything
with the result except for a basic regex-based sanity check.
Collecting that output from buildfarm runs is seeming very
attractive though, because it would help us determine what sort
of timing resolution is available on supported platforms.
It's not very long, so let's just note it verbatim in the TAP log.

Discussion: https://postgr.es/m/3321785.1758728271@sss.pgh.pa.us

Fix incorrect and inconsistent comments in tableam.h and heapam.c.

This commit corrects several issues in function comments:

* The parameter "rel" was incorrectly referred to as "relation" in the comments
   for table_tuple_delete(), table_tuple_update(), and table_tuple_lock().
* In table_tuple_delete(), "changingPart" was listed as an output parameter
   in the comments but is actually input.
* In table_tuple_update(), "slot" was listed as an input parameter
   in the comments but is actually output.
* The comment for "update_indexes" in table_tuple_update() was mis-indented.
* The comments for heap_lock_tuple() incorrectly referenced a non-existent
   "tid" parameter.

Author: Chao Li <lic@highgo.com>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/CAEoWx2nB6Ay8g=KEn7L3qbYX_4+sLk9XOMkV0XZqHR4cTY8ZvQ@mail.gmail.com

Remove PointerIsValid()

This doesn't provide any value over the standard style of checking the
pointer directly or comparing against NULL.

Also remove related:
- AllocPointerIsValid() [unused]
- IndexScanIsValid() [had one user]
- HeapScanIsValid() [unused]
- InvalidRelation [unused]

Leaving HeapTupleIsValid(), ItemIdIsValid(), PortalIsValid(),
RelationIsValid for now, to reduce code churn.

Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/flat/ad50ab6b-6f74-4603-b099-1cd6382fb13d%40eisentraut.org
Discussion: https://www.postgresql.org/message-id/CA+hUKG+NFKnr=K4oybwDvT35dW=VAjAAfiuLxp+5JeZSOV3nBg@mail.gmail.com
Discussion: https://www.postgresql.org/message-id/bccf2803-5252-47c2-9ff0-340502d5bd1c@iki.fi

Fix incorrect option name in usage screen

The usage screen incorrectly refered to the --docs option as --sgml.
Backpatch down to v17 where this script was introduced.

Author: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20250729.135638.1148639539103758555.horikyota.ntt@gmail.com
Backpatch-through: 17

Consistently handle tab delimiters for wait event names

Format validation and element extraction for intermediate line
strings were inconsistent in their handling of tab delimiters,
which resulted in an unclear error when multiple tab characters
were used as a delimiter. This fixes it by using captures from
the validation regex instead of a separate split() to avoid the
inconsistency. Also, it ensures that \t+ is used consistently
when inspecting the strings.

Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/20250729.135638.1148639539103758555.horikyota.ntt@gmail.com

Update GB18030 encoding from version 2000 to 2022

Mappings for 18 characters have changed, affecting 36 code points. This
is a break in compatibility, but these characters are rarely used.

U+E5E5 (Private Use Area) was previously mapped to \xA3A0. This code
point now maps to \x65356535. Attempting to convert \xA3A0 will now
raise an error.

Separate from the 2022 update, the following mappings were previously
swapped, and subsequently corrected in 2000 and later versions:
* U+E7C7 (Private Use Area) now maps to \x8135F437
* U+1E3F (Latin Small Letter M with Acute) now maps to \xA8BC

The 2022 standard mentions the following policy changes, but they
have no effect in our implementation:

66 new ideographs are now required, but these are mapped
algorithmically so were already handled by utf8_and_gb18030.c.

Nine CJK compatibility ideographs are no longer required, but
implementations may retain them, as does the source we use from
the Unicode Consortium.

Release notes: Compatibility section

For further details, see:
https://www.unicode.org/L2/L2022/22274-disruptive-changes.pdf
https://ken-lunde.medium.com/the-gb-18030-2022-standard-3d0ebaeb4132

Author: Chao Li <lic@highgo.com>
Author: Zheng Tao <taoz@highgo.com>
Discussion: https://postgr.es/m/966d9fc.169.198741fe60b.Coremail.jiaoshuntian%40highgo.com

Fix LOCK_TIMEOUT handling during parallel apply.

Previously, the parallel apply worker used SIGINT to receive a graceful
shutdown signal from the leader apply worker. However, SIGINT is also used
by the LOCK_TIMEOUT handler to trigger a query-cancel interrupt. This
overlap caused the parallel apply worker to miss LOCK_TIMEOUT signals,
leading to incorrect behavior during lock wait/contention.

This patch resolves the conflict by switching the graceful shutdown signal
from SIGINT to SIGUSR2.

Reported-by: Zane Duffield <duffieldzane@gmail.com>
Diagnosed-by: Zhijie Hou <houzj.fnst@fujitsu.com>
Author: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Backpatch-through: 16, where it was introduced
Discussion: https://postgr.es/m/CACMiCkXyC4au74kvE2g6Y=mCEF8X6r-Ne_ty4r7qWkUjRE4+oQ@mail.gmail.com

Fix compiler warnings in test_bitmapset

The macros doing conversions of/from "text" from/to Bitmapset were using
arbitrary casts with Datum, something that is not fine since
2a600a93c7be.

These macros do not actually need casts with Datum, as they are given
already "text" and Bitmapset data in input. They are updated to use
cstring_to_text() and text_to_cstring(), fixing the compiler warnings
reported by the buildfarm. Note that appending a -m32 to gcc to trigger
32-bit builds was enough to reproduce the warnings here.

While on it, outer parenthesis are added to TEXT_TO_BITMAPSET(), and
inner parenthesis are removed from BITMAPSET_TO_TEXT(), to make these
macros more consistent with the style used in the tree, based on
suggestions by Tom Lane.

Oversights in commit 00c3d87a5cab.

Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Greg Burd <greg@burd.me>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/3027069.1758606227@sss.pgh.pa.us

Keep track of what RTIs a Result node is scanning.

Result nodes now include an RTI set, which is only non-NULL when they
have no subplan, and is taken from the relid set of the RelOptInfo that
the Result is generating. ExplainPreScanNode now takes notice of these
RTIs, which means that a few things get schema-qualified in the
regression tests that previously did not. This makes the output more
consistent between cases where some part of the plan tree is replaced by
a Result node and those where this does not happen.

Likewise, pg_overexplain's EXPLAIN (RANGE_TABLE) now displays the RTIs
stored in a Result node just as it already does for other RTI-bearing
node types.

Result nodes also now include a result_reason, which tells us something
about why the Result node was inserted. Using that information, EXPLAIN
now emits, where relevant, a "Replaces" line describing the origin of
a Result node.

The purpose of these changes is to allow code that inspects a Plan
tree to understand the origin of Result nodes that appear therein.

Discussion: http://postgr.es/m/CA+TgmoYeUZePZWLsSO+1FAN7UPePT_RMEZBKkqYBJVCF1s60=w@mail.gmail.com
Reviewed-by: Alexandra Wang <alexandra.wang.oss@gmail.com>
Reviewed-by: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>