Andrew Dunstan [Sat, 14 Sep 2024 14:26:25 +0000 (10:26 -0400)]
Improve meson's detection of perl build flags
The current method of detecting perl build flags breaks if the path to
perl contains a space. This change makes two improvements. First,
instead of getting a list of ldflags and ccdlflags and then trying to
filter those out of the reported ldopts, we tell perl to suppress
reporting those in the first instance. Second, it tells perl to parse
those and output them, one per line. Thus any space on the option in a
file name, for example, is preserved.
Andrew Dunstan [Sat, 14 Sep 2024 12:37:08 +0000 (08:37 -0400)]
Only define NO_THREAD_SAFE_LOCALE for MSVC plperl when required
Latest versions of Strawberry Perl define USE_THREAD_SAFE_LOCALE, and we
therefore get a handshake error when building against such instances.
The solution is to perform a test to see if USE_THREAD_SAFE_LOCALE is
defined and only define NO_THREAD_SAFE_LOCALE if it isn't.
Backpatch the meson.build fix back to release 16 and apply the same
logic to Mkvcbuild.pm in releases 12 through 16.
Original report of the issue from Muralikrishna Bandaru.
Tom Lane [Fri, 13 Sep 2024 20:16:47 +0000 (16:16 -0400)]
Allow _h_indexbuild() to be interrupted.
When we are building a hash index that is large enough to need
pre-sorting (larger than either maintenance_work_mem or NBuffers),
the initial sorting phase is interruptible, but the insertion
phase wasn't. Add the missing CHECK_FOR_INTERRUPTS().
Per bug #18616 from Alexander Lakhin. Back-patch to all
supported branches.
I managed to break this test in two different ways in commit 05036a3155.
First, the output of the new call to tuple_data_split() on the test
sequence is dependent on endianness. This is fixed by setting a
special start value for the test sequence that produces the same
output regardless of the endianness of the machine.
Second, on versions older than v15, the new test case fails under
"force_parallel_mode = regress" with the following error:
ERROR: cannot access temporary tables during a parallel operation
This is because pageinspect's disk-accessing functions are
incorrectly marked PARALLEL SAFE on versions older than v15 (see
commit aeaaf520f4 for details). This one is fixed by changing the
test sequence to be permanent. The only reason it was previously
marked temporary was to avoid needing a DROP SEQUENCE command at
the end of the test. Unlike some other tests in this file, the use
of a permanent sequence here shouldn't result in any test
instability like what was fixed by commit e2933a6e11.
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/ZuOKOut5hhDlf_bP%40nathan
Backpatch-through: 12
Amit Langote [Fri, 13 Sep 2024 07:08:13 +0000 (16:08 +0900)]
SQL/JSON: Update example in JSON_QUERY() documentation
Commit 2c27346ed684 fixed the behavior of JSON_QUERY() when WITH
CONDITIONAL WRAPPER is used, but the documentation example wasn't
updated to reflect this change. This commit updates the example to
show the correct result.
Reintroduce support for sequences in pgstattuple and pageinspect.
Commit 4b82664156 restricted a number of functions provided by
contrib modules to only relations that use the "heap" table access
method. Sequences always use this table access method, but they do
not advertise as such in the pg_class system catalog, so the
aforementioned commit also (presumably unintentionally) removed
support for sequences from some of these functions. This commit
reintroduces said support for sequences to these functions and adds
a couple of relevant tests.
Co-authored-by: Ayush Vatsa Reviewed-by: Robert Haas, Michael Paquier, Matthias van de Meent
Discussion: https://postgr.es/m/CACX%2BKaP3i%2Bi9tdPLjF5JCHVv93xobEdcd_eB%2B638VDvZ3i%3DcQA%40mail.gmail.com
Backpatch-through: 12
Tom Lane [Thu, 12 Sep 2024 18:30:29 +0000 (14:30 -0400)]
Make jsonpath .string() be immutable for datetimes.
Discussion of commit ed055d249 revealed that we don't actually
want jsonpath's .string() method to depend on DateStyle, nor
TimeZone either, because the non-"_tz" jsonpath functions are
supposed to be immutable. Potentially we could allow a TimeZone
dependency in the "_tz" variants, but it seems better to just
uniformly define this method as returning the same string that
jsonb text output would do. That's easier to implement too,
saving a couple dozen lines.
Patch by me, per complaint from Peter Eisentraut. Back-patch
to v17 where this feature came in (in 66ea94e8e). Also
back-patch ed055d249 to provide test cases.
David Rowley [Thu, 12 Sep 2024 10:37:27 +0000 (22:37 +1200)]
Doc: alphabetize aggregate function table
A few recent JSON aggregates have been added without much consideration
to the existing order. Put these back in alphabetical order (with the
exception of the JSONB variant of each JSON aggregate).
Author: Wolfgang Walther <walther@technowledgy.de> Reviewed-by: Marlene Reiterer <marlene.reiterer.03@gmail.com>
Discussion: https://postgr.es/m/6a7b910c-3feb-4006-b817-9b4759cb6bb6%40technowledgy.de
Backpatch-through: 16, where these aggregates were added
Amit Langote [Thu, 12 Sep 2024 00:36:31 +0000 (09:36 +0900)]
SQL/JSON: Fix JSON_QUERY(... WITH CONDITIONAL WRAPPER)
Currently, when WITH CONDITIONAL WRAPPER is specified, array wrappers
are applied even to a single SQL/JSON item if it is a scalar JSON
value, but this behavior does not comply with the standard.
To fix, apply wrappers only when there are multiple SQL/JSON items
in the result.
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Peter Eisentraut <peter@eisentraut.org>
Author: Amit Langote <amitlangote09@gmail.com> Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/8022e067-818b-45d3-8fab-6e0d94d03626%40eisentraut.org
Backpatch-through: 17
Tom Lane [Wed, 11 Sep 2024 15:41:47 +0000 (11:41 -0400)]
Remove incorrect Assert.
check_agglevels_and_constraints() asserted that if we find an
aggregate function in an EXPR_KIND_FROM_SUBSELECT expression, the
expression must be in a LATERAL subquery. Alexander Lakhin found a
case where that's not so: because of the odd scoping rules for NEW/OLD
within a rule, a reference to NEW/OLD could cause an aggregate to be
considered top-level even though it's in an unmarked sub-select.
The error message that would be thrown seems sufficiently on-point,
so just remove the Assert. (Hence, this is not a bug for production
builds.)
This Assert was added by me in commit eaccfded9 (9.3 era). It looks
like I put it in to cross-check that the new logic for detecting
misplaced aggregates (using agglevelsup) caught the same cases that a
previous check on p_lateral_active did. So there might have been some
related misbehavior before eaccfded9 ... but that's very ancient
history by now, so I didn't dig any deeper.
Per bug #18608 from Alexander Lakhin. Back-patch to all supported
branches.
Tomas Vondra [Wed, 11 Sep 2024 11:21:05 +0000 (13:21 +0200)]
Fix unique key checks in JSON object constructors
When building a JSON object, the code builds a hash table of keys, to
allow checking if the keys are unique. The uniqueness check and adding
the new key happens in json_unique_check_key(), but this assumes the
pointer to the key remains valid.
Unfortunately, two places passed pointers to keys in a buffer, while
also appending more data (additional key/value pairs) to the buffer.
With enough data the buffer is resized by enlargeStringInfo(), which
calls repalloc(), invalidating the earlier key pointers.
Due to this the uniqueness check may fail with both false negatives and
false positives, producing JSON objects with duplicate keys or failing
to produce a perfectly valid JSON object.
This affects multiple functions that enforce uniqueness of keys, all
introduced in PG16 with the new SQL/JSON:
Existing regression tests did not detect the issue, simply because the
initial buffer size is 1024 and the objects were small enough not to
require the repalloc.
With a sufficiently large object, AddressSanitizer reported the access
to invalid memory immediately. So would valgrind, of course.
Fixed by copying the key into the hash table memory context, and adding
regression tests with enough data to repalloc the buffer. Backpatch to
16, where the functions were introduced.
Reported by Alexander Lakhin. Investigation and initial fix by Junwang
Zhao, with various improvements and tests by me.
Reported-by: Alexander Lakhin
Author: Junwang Zhao, Tomas Vondra
Backpatch-through: 16
Discussion: https://postgr.es/m/18598-3279ed972a2347c7@postgresql.org
Discussion: https://postgr.es/m/CAEG8a3JjH0ReJF2_O7-8LuEbO69BxPhYeXs95_x7+H9AMWF1gw@mail.gmail.com
Tom Lane [Tue, 10 Sep 2024 20:20:31 +0000 (16:20 -0400)]
Fix some whitespace issues in XMLSERIALIZE(... INDENT).
We must drop whitespace while parsing the input, else libxml2
will include "blank" nodes that interfere with the desired
indentation behavior. The end result is that we didn't indent
nodes separated by whitespace.
Also, it seems that libxml2 may add a trailing newline when working
in DOCUMENT mode. This is semantically insignificant, so strip it.
This is in the gray area between being a bug fix and a definition
change. However, the INDENT option is still pretty new (since v16),
so I think we can get away with changing this in stable branches.
Hence, back-patch to v16.
Amit Langote [Mon, 9 Sep 2024 02:55:38 +0000 (11:55 +0900)]
SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps
When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the the constant NULL expression. This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL. However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.
Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
Michael Paquier [Mon, 9 Sep 2024 04:49:59 +0000 (13:49 +0900)]
Fix waits of REINDEX CONCURRENTLY for indexes with predicates or expressions
As introduced by f9900df5f94, a REINDEX CONCURRENTLY job done for an
index with predicates or expressions would set PROC_IN_SAFE_IC in its
MyProc->statusFlags, causing it to be ignored by other concurrent
operations.
Such concurrent index rebuilds should never be ignored, as a predicate
or an expression could call a user-defined function that accesses a
different table than the table where the index is rebuilt.
A test that uses injection points is added, backpatched down to 17.
Michail has proposed a different test, but I have added something
simpler with more coverage.
Tom Lane [Fri, 6 Sep 2024 15:57:57 +0000 (11:57 -0400)]
Fix incorrect pg_stat_io output on 32-bit machines.
pg_stat_get_io() applied TimestampTzGetDatum twice to the
stat_reset_timestamp value. On 64-bit builds that's harmless because
TimestampTzGetDatum is a no-op, but on 32-bit builds it results in
displaying garbage in the stats_reset column of the pg_stat_io view.
Bug dates to commit a9c70b46d which introduced pg_stat_io, so
back-patch to v16 where that came in.
Amit Langote [Fri, 6 Sep 2024 04:25:14 +0000 (13:25 +0900)]
SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE
Use EMPTY ARRAY instead of EMPTY.
This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.
Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
Amit Langote [Fri, 6 Sep 2024 04:25:02 +0000 (13:25 +0900)]
SQL/JSON: Fix JSON_TABLE() column deparsing
The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.
Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
Amit Langote [Fri, 6 Sep 2024 03:04:29 +0000 (12:04 +0900)]
SQL/JSON: Avoid initializing unnecessary ON ERROR / ON EMPTY steps
When the ON ERROR / ON EMPTY behavior is to return NULL, returning
NULL directly from ExecEvalJsonExprPath() suffices. Therefore, there's
no need to create separate steps to check the error/empty flag or
those to evaluate the the constant NULL expression. This speeds up
common cases because the default ON ERROR / ON EMPTY behavior for
JSON_QUERY() and JSON_VALUE() is to return NULL. However, these steps
are necessary if the RETURNING type is a domain, as constraints on the
domain may need to be checked.
Reported-by: Jian He <jian.universality@gmail.com>
Author: Jian He <jian.universality@gmail.com>
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
Amit Langote [Fri, 6 Sep 2024 01:13:03 +0000 (10:13 +0900)]
SQL/JSON: Fix default ON ERROR behavior for JSON_TABLE
Use EMPTY ARRAY instead of EMPTY.
This change does not affect the runtime behavior of JSON_TABLE(),
which continues to return an empty relation ON ERROR. It only alters
whether the default ON ERROR behavior is shown in the deparsed output.
Reported-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
Amit Langote [Fri, 6 Sep 2024 01:12:16 +0000 (10:12 +0900)]
SQL/JSON: Fix JSON_TABLE() column deparsing
The deparsing code in get_json_expr_options() unnecessarily emitted
the default column-specific ON ERROR / EMPTY behavior when the
top-level ON ERROR behavior in JSON_TABLE was set to ERROR. Fix that
by not overriding the column-specific default, determined based on
the column's JsonExprOp in get_json_table_columns(), with
JSON_BEHAVIOR_ERROR when that is the top-level ON ERROR behavior.
Note that this only removes redundancy; the current deparsing output
is not incorrect, just redundant.
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com
Backpatch-through: 17
Tom Lane [Thu, 5 Sep 2024 16:42:33 +0000 (12:42 -0400)]
Prevent mis-encoding of "trailing junk after numeric literal" errors.
Since commit 2549f0661, we reject an identifier immediately following
a numeric literal (without separating whitespace), because that risks
ambiguity with hex/octal/binary integers. However, that patch used
token patterns like "{integer}{ident_start}", which is problematic
because {ident_start} matches only a single byte. If the first
character after the integer is a multibyte character, this ends up
with flex reporting an error message that includes a partial multibyte
character. That can cause assorted bad-encoding problems downstream,
both in the report to the client and in the postmaster log file.
To fix, use {identifier} not {ident_start} in the "junk" token
patterns, so that they will match complete multibyte characters.
This seems generally better user experience quite aside from the
encoding problem: for "123abc" the error message will now say that
the error appeared at or near "123abc" instead of "123a".
While at it, add some commentary about why these patterns exist
and how they work.
Report and patch by Karina Litskevich; review by Pavel Borisov.
Back-patch to v15 where the problem came in.
Michael Paquier [Wed, 4 Sep 2024 01:22:19 +0000 (10:22 +0900)]
Fix inconsistent LWLock tranche name "CommitTsSLRU"
This term was using an inconsistent casing between the code and the
documentation, using "CommitTsSLRU" in wait_event_names.txt and
"CommitTSSLRU" in the code.
Let's update the term in the code to reflect what's in the
documentation, "CommitTs" being more commonly used, so as
pg_stat_activity shows the same term as the documentation.
Michael Paquier [Tue, 3 Sep 2024 23:56:28 +0000 (08:56 +0900)]
Avoid installcheck failure in TAP tests using injection_points
These tests depend on the test module injection_points to be installed,
but it may not be available as the contents of src/test/modules/ are not
installed by default.
This commit adds a workaround based on a scan of pg_available_extensions
to check if the extension is available, skipping the test if it is not.
This allows installcheck to work transparently.
There are more tests impacted by this problem on HEAD, but for now this
addresses only the tests that exist on HEAD and v17 as the release is
close by.
Thomas Munro [Sat, 31 Aug 2024 05:27:38 +0000 (17:27 +1200)]
Fix unfairness in all-cached parallel seq scan.
Commit b5a9b18c introduced block streaming infrastructure with a special
fast path for all-cached scans, and commit b7b0f3f2 connected the
infrastructure up to sequential scans. One of the fast path
micro-optimizations had an unintended consequence: it interfered with
parallel sequential scan's block range allocator (from commit 56788d21),
which has its own ramp-up and ramp-down algorithm when handing out
groups of pages to workers. A scan of an all-cached table could give
extra blocks to one worker, when others had finished. In some plans
(probably already very bad plans, such as the one reported by
Alexander), the unfairness could be magnified.
An internal buffer of 16 block numbers is removed, keeping just a single
block buffer for technical reasons.
Back-patch to 17.
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/63a63690-dd92-c809-0b47-af05459e95d1%40gmail.com
Thomas Munro [Sat, 31 Aug 2024 02:32:08 +0000 (14:32 +1200)]
Stabilize 039_end_of_wal test.
The first test was sensitive to the insert LSN after setting up the
catalogs, which depended on environmental things like the locales on the
OS and usernames. Switch to a new WAL file before the first test, as a
simple way to put every computer into the same state.
Back-patch to all supported releases.
Reported-by: Anton Voloshin <a.voloshin@postgrespro.ru> Reported-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://postgr.es/m/b26aeac2-cb6d-4633-a7ea-945baae83dcf%40postgrespro.ru
Tom Lane [Fri, 30 Aug 2024 20:47:39 +0000 (16:47 -0400)]
Make postgres_fdw's query_cancel test less flaky.
This test occasionally shows
+WARNING: could not get result of cancel request due to timeout
which appears to be because the cancel request is sometimes unluckily
sent to the remote session between queries, and then it's ignored.
This patch tries to make that less probable in three ways:
1. Use a test query that does not involve remote estimates, so that
no EXPLAINs are sent.
2. Make sure that the remote session is ready-to-go (transaction
started, SET commands sent) before we start the timer.
3. Increase the statement_timeout to 100ms, to give the local
session enough time to plan and issue the query.
We might have to go higher than 100ms to make this adequately
stable in the buildfarm, but let's see how it goes.
Tom Lane [Fri, 30 Aug 2024 16:42:13 +0000 (12:42 -0400)]
Avoid inserting PlaceHolderVars in cases where pre-v16 PG did not.
Commit 2489d76c4 removed some logic from pullup_replace_vars()
that avoided wrapping a PlaceHolderVar around a pulled-up
subquery output expression if the expression could be proven
to go to NULL anyway (because it contained Vars or PHVs of the
pulled-up relation and did not contain non-strict constructs).
But removing that logic turns out to cause performance regressions
in some cases, because the extra PHV blocks subexpression folding,
and will do so even if outer-join reduction later turns it into a
no-op with no phnullingrels bits. This can for example prevent
an expression from being matched to an index.
The reason for always adding a PHV was to ensure we had someplace
to put the varnullingrels marker bits of the Var being replaced.
However, it turns out we can optimize in exactly the same cases that
the previous code did, because we can instead attach the needed
varnullingrels bits to the contained Var(s)/PHV(s).
This is not a complete solution --- it would be even better if we
could remove PHVs after reducing them to no-ops. It doesn't look
practical to back-patch such an improvement, but this change seems
safe and at least gets rid of the performance-regression cases.
Per complaint from Nikhil Raj. Back-patch to v16 where the
problem appeared.
Tom Lane [Thu, 29 Aug 2024 17:24:17 +0000 (13:24 -0400)]
Fix mis-deparsing of ORDER BY lists when there is a name conflict.
If an ORDER BY item in SELECT is a bare identifier, the parser
first seeks it as an output column name of the SELECT (for SQL92
compatibility). However, ruleutils.c is expecting the SQL99
interpretation where such a name is an input column name. So it's
possible to produce an incorrect display of a view in the (admittedly
pretty ill-advised) case where some other column is renamed in the
SELECT output list to match an ORDER BY column.
This can be fixed by table-qualifying such names in the dumped
view text. To avoid cluttering less-ill-advised queries, we'd
like to do so only when there's an actual name conflict.
That requires passing the current get_query_def call's resultDesc
parameter down to get_variable, so that it can determine what
the output column names are. In hopes of reducing rather than
increasing notational clutter in ruleutils.c, I moved that value
into the deparse_context struct and removed it from the parameter
lists of get_query_def's other subroutines.
I made a few other cosmetic changes while at it:
* Likewise move the colNamesVisible parameter into deparse_context.
* Rename deparse_context's windowTList field to targetList,
since it's no longer used only in connection with WINDOW clauses.
* Replace the special_exprkind field with a bool inGroupBy,
since that was all it was being used for, and the apparent
flexibility of storing a ParseExprKind proved to be illusory.
(We need a separate varInOrderBy field to make this patch work.)
* Remove useless save/restore logic in get_select_query_def.
In principle, this bug is quite old. However, it seems unreachable
before 1b4d280ea, because before that the presence of "new" and "old"
entries in a view's rangetable caused us to always table-qualify every
Var reference in dumped views. Hence, back-patch to v16 where that
came in.
Peter Eisentraut [Thu, 29 Aug 2024 06:38:29 +0000 (08:38 +0200)]
Disallow USING clause when altering type of generated column
This does not make sense. It would write the output of the USING
clause into the converted column, which would violate the generation
expression. This adds a check to error out if this is specified.
There was a test for this, but that test errored out for a different
reason, so it was not effective.
Reported-by: Jian He <jian.universality@gmail.com> Reviewed-by: Yugo NAGATA <nagata@sraoss.co.jp>
Discussion: https://www.postgresql.org/message-id/flat/c7083982-69f4-4b14-8315-f9ddb20b9834%40eisentraut.org
Amit Kapila [Thu, 29 Aug 2024 03:15:41 +0000 (08:45 +0530)]
Doc: Fix the ambiguity in the description of failover slots.
The failover slots ensure a seamless transition of a subscriber after the
standby is promoted. But the docs for it also explain the behavior of
asynchronous replication which can confuse the readers.
Masahiko Sawada [Mon, 26 Aug 2024 18:00:04 +0000 (11:00 -0700)]
Fix memory counter update in ReorderBuffer.
Commit 5bec1d6bc5e changed the memory usage updates of the
ReorderBufferTXN to zero all at once by subtracting txn->size, rather
than updating it for each change. However, if TOAST reconstruction
data remained in the transaction when freeing it, there were cases
where it further subtracted the memory counter from zero, resulting in
an assertion failure.
This change calculates the memory size for each change and updates the
memory usage to precisely the amount that has been freed.
Backpatch to v17, where this was introducd.
Reviewed-by: Amit Kapila, Shlok Kyal
Discussion: https://postgr.es/m/CAD21AoAqkNUvicgKPT_dXzNoOwpPkVTg0QPPxEcWmzT0moCJ1g%40mail.gmail.com
Backpatch-through: 17
Peter Geoghegan [Mon, 26 Aug 2024 15:29:13 +0000 (11:29 -0400)]
Fix nbtree lookahead overflow bug.
Add bounds checking to nbtree's lookahead/skip-within-a-page mechanism.
Otherwise it's possible for cases with lots of before-array-keys tuples
to overflow an int16 variable, causing the mechanism to generate an out
of bounds page offset number.
Oversight in commit 5bf748b8, which enhanced nbtree ScalarArrayOp
execution.
Reported-By: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/6c68ac42-bbb5-8b24-103e-af0e279c536f@gmail.com
Backpatch: 17-, where nbtree SAOP execution was enhanced.
The reason for reverting is security issues related to repeatable name lookups
(CVE-2014-0062). Even though 04158e7fa3 solved part of the problem, there
are still remaining issues, which aren't feasible to even carefully analyze
before the RC deadline.
Reported-by: Noah Misch, Robert Haas
Discussion: https://postgr.es/m/20240808171351.a9.nmisch%40google.com
Backpatch-through: 17
Tom Lane [Fri, 23 Aug 2024 14:12:56 +0000 (10:12 -0400)]
Provide feature-test macros for libpq features added in v17.
As per the policy established in commit 6991e774e, invent macros
that can be tested at compile time to detect presence of new libpq
features. This should make calling code more readable and less
error-prone than checking the libpq version would be (especially
since we don't expose that at compile time; the server version is
an unreliable substitute).
Noah Misch [Thu, 22 Aug 2024 07:07:04 +0000 (00:07 -0700)]
Fix attach of a previously-detached injection point.
It's normal for the name in a free slot to match the new name. The
max_inuse mechanism kept simple cases from reaching the problem. The
problem could appear when index 0 was the previously-detached entry and
index 1 is in use. Back-patch to v17, where this code first appeared.
Avoid repeated table name lookups in createPartitionTable()
Currently, createPartitionTable() opens newly created table using its name.
This approach is prone to privilege escalation attack, because we might end
up opening another table than we just created.
This commit address the issue above by opening newly created table by its
OID. It appears to be tricky to get a relation OID out of ProcessUtility().
We have to extend TableLikeClause with new newRelationOid field, which is
filled within ProcessUtility() to be further accessed by caller.
Tom Lane [Wed, 21 Aug 2024 16:00:03 +0000 (12:00 -0400)]
Disallow creating binary-coercible casts involving range types.
For a long time we have forbidden binary-coercible casts to or from
composite and array types, because such a cast cannot work correctly:
the type OID embedded in the value would need to change, but it won't
in a binary coercion. That reasoning applies equally to range types,
but we overlooked installing a similar restriction here when we
invented range types. Do so now.
Given the lack of field complaints, we won't change this in stable
branches, but it seems not too late for v17.
Per discussion of a problem noted by Peter Eisentraut.
Peter Eisentraut [Wed, 21 Aug 2024 13:11:21 +0000 (15:11 +0200)]
doc: remove llvm-config search from configure documentation
As of 4dd29b6833, we no longer attempt to locate any other llvm-config
variant than plain llvm-config in configure-based builds; update the
documentation accordingly. (For Meson-based builds, we still use Meson's
LLVMDependencyConfigTool [0], which runs through a set of possible
suffixes [1], so no need to update the documentation there.)
Amit Kapila [Wed, 21 Aug 2024 03:38:16 +0000 (09:08 +0530)]
Don't advance origin during apply failure.
We advance origin progress during abort on successful streaming and
application of ROLLBACK in parallel streaming mode. But the origin
shouldn't be advanced during an error or unsuccessful apply due to
shutdown. Otherwise, it will result in a transaction loss as such a
transaction won't be sent again by the server.
Reported-by: Hou Zhijie
Author: Hayato Kuroda and Shveta Malik Reviewed-by: Amit Kapila
Backpatch-through: 16
Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com
Nathan Bossart [Tue, 20 Aug 2024 18:43:20 +0000 (13:43 -0500)]
Fix a couple of wait event descriptions.
The descriptions for ProcArrayGroupUpdate and XactGroupUpdate claim
that these events mean we are waiting for the group leader "at end
of a parallel operation," but neither pertains to parallel
operations. This commit reverts these descriptions to their
wording before commit 3048898e73, i.e., "end of a parallel
operation" is changed to "transaction end."
Author: Sameer Kumar Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAGPeHmh6UMrKQHKCmX%2B5vV5TH9P%3DKw9en3k68qEem6J%3DyrZPUA%40mail.gmail.com
Backpatch-through: 13
Alvaro Herrera [Mon, 19 Aug 2024 20:09:10 +0000 (16:09 -0400)]
Avoid failure to open dropped detached partition
When a partition is detached and immediately dropped, a prepared
statement could try to compute a new partition descriptor that includes
it. This leads to this kind of error:
ERROR: could not open relation with OID 457639
Avoid this by skipping the partition in expand_partitioned_rtentry if it
doesn't exist.
Noted by me while investigating bug #18559. Kuntal Gosh helped to
identify the exact failure.
Backpatch to 14, where DETACH CONCURRENTLY was introduced.
Tomas Vondra [Mon, 19 Aug 2024 11:31:51 +0000 (13:31 +0200)]
Explain dropdb can't use syscache because of TOAST
Add a comment explaining dropdb() can't rely on syscache. The issue with
flattened rows was fixed by commit 0f92b230f88b, but better to have
a clear explanation why the systable scan is necessary. The other places
doing in-place updates on pg_database have the same comment.
Suggestion and patch by Yugo Nagata. Backpatch to 12, same as the fix.
Commit 274bbced disabled session tickets for TLSv1.3 on top of the
already disabled TLSv1.2 session tickets, but accidentally caused
a regression where TLSv1.2 session tickets were incorrectly sent.
Fix by unconditionally disabling TLSv1.2 session tickets and only
disable TLSv1.3 tickets when the right version of OpenSSL is used.
Backpatch to all supported branches.
Reported-by: Cameron Vogt <cvogt@automaticcontrols.net> Reported-by: Fire Emerald <fire.github@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/DM6PR16MB3145CF62857226F350C710D1AB852@DM6PR16MB3145.namprd16.prod.outlook.com
Backpatch-through: v12
Thomas Munro [Mon, 19 Aug 2024 09:21:03 +0000 (21:21 +1200)]
Fix harmless LC_COLLATE[_MASK] confusion.
Commit ca051d8b101 called newlocale(LC_COLLATE, ...) instead of
newlocale(LC_COLLATE_MASK, ...), in code reached only on FreeBSD. They
have the same value on that OS, explaining why it worked. Fix.
Michael Paquier [Mon, 19 Aug 2024 03:34:52 +0000 (12:34 +0900)]
Fix more holes with SLRU code in need of int64 for segment numbers
This is a continuation of c9e24573905b, containing changes included into
the proposed patch that have been missed in the actual commit. I have
managed to miss these diffs while doing a rebase of the original patch.
Thanks to Noah Misch, Peter Eisentraut and Alexander Korotkov for the
pokes.
Thomas Munro [Sun, 18 Aug 2024 23:47:37 +0000 (11:47 +1200)]
ci: Upgrade MacPorts version to 2.10.1.
MacPorts version 2.9.3 started failing in our ci_macports_packages.sh
script, for reasons not fully determined, but plausibly linked to the
release of 2.10.1. 2.10.1 seems to work, so let's switch to it.
Back-patch to 15, where CI began.
Reported-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/81f104e8-f0a9-43c0-85bd-2bbbf590a5b8%40eisentraut.org
Tomas Vondra [Sun, 18 Aug 2024 22:04:41 +0000 (00:04 +0200)]
Fix DROP DATABASE for databases with many ACLs
Commit c66a7d75e652 modified DROP DATABASE so that if interrupted, the
database is known to be in an invalid state and can only be dropped.
This is done by setting a flag using an in-place update, so that it's
not lost in case of rollback.
For databases with many ACLs, this may however fail like this:
ERROR: wrong tuple length
This happens because with many ACLs, the pg_database.datacl attribute
gets TOASTed. The dropdb() code reads the tuple from the syscache, which
means it's detoasted. But the in-place update expects the tuple length
to match the on-disk tuple.
Fixed by reading the tuple from the catalog directly, not from syscache.
Report and fix by Ayush Tiwari. Backpatch to 12. The DROP DATABASE fix
was backpatched to 11, but 11 is EOL at this point.
Alvaro Herrera [Mon, 12 Aug 2024 22:17:56 +0000 (18:17 -0400)]
Fix creation of partition descriptor during concurrent detach+drop
If a partition undergoes DETACH CONCURRENTLY immediately followed by
DROP, this could cause a problem for a concurrent transaction
recomputing the partition descriptor when running a prepared statement,
because it tries to dereference a pointer to a tuple that's not found in
a catalog scan.
The existing retry logic added in commit dbca3469ebf8 is sufficient to
cope with the overall problem, provided we don't try to dereference a
non-existant heap tuple.
Arguably, the code in RelationBuildPartitionDesc() has been wrong all
along, since no check was added in commit 898e5e3290a7 against receiving
a NULL tuple from the catalog scan; that bug has only become
user-visible with DETACH CONCURRENTLY which was added in branch 14.
Therefore, even though there's no known mechanism to cause a crash
because of this, backpatch the addition of such a check to all supported
branches. In branches prior to 14, this would cause the code to fail
with a "missing relpartbound for relation XYZ" error instead of
crashing; that's okay, because there are no reports of such behavior
anyway.
Tom Lane [Mon, 12 Aug 2024 17:18:36 +0000 (13:18 -0400)]
Log more info when wait-for-catchup tests time out.
Cluster.pm's wait_for_catchup and allied subroutines don't provide
enough information to diagnose the problem when a wait times out.
In hopes of debugging some intermittent buildfarm failures, let's
dump the ending state of the relevant system view when that happens.
Tom Lane [Sun, 11 Aug 2024 16:24:56 +0000 (12:24 -0400)]
Suppress Coverity warnings about Asserts in get_name_for_var_field.
Coverity thinks dpns->plan could be null at these points. That
shouldn't really be possible, but it's easy enough to modify the
Asserts so they'd not core-dump if it were true.
These are new in b919a97a6. Back-patch to v13; the v12 version
of the patch didn't have these Asserts.
Tom Lane [Sat, 10 Aug 2024 19:51:28 +0000 (15:51 -0400)]
Allow adjusting session_authorization and role in parallel workers.
The code intends to allow GUCs to be set within parallel workers
via function SET clauses, but not otherwise. However, doing so fails
for "session_authorization" and "role", because the assign hooks for
those attempt to set the subsidiary "is_superuser" GUC, and that call
falls foul of the "not otherwise" prohibition. We can't switch to
using GUC_ACTION_SAVE for this, so instead add a new GUC variable
flag GUC_ALLOW_IN_PARALLEL to mark is_superuser as being safe to set
anyway. (This is okay because is_superuser has context PGC_INTERNAL
and thus only hard-wired calls can change it. We'd need more thought
before applying the flag to other GUCs; but maybe there are other
use-cases.) This isn't the prettiest fix perhaps, but other
alternatives we thought of would be much more invasive.
While here, correct a thinko in commit 059de3ca4: when rejecting
a GUC setting within a parallel worker, we should return 0 not -1
if the ereport doesn't longjmp. (This seems to have no consequences
right now because no caller cares, but it's inconsistent.) Improve
the comments to try to forestall future confusion of the same kind.
Despite the lack of field complaints, this seems worth back-patching.
Thanks to Nathan Bossart for the idea to invent a new flag,
and for review.
John Naylor [Tue, 6 Aug 2024 13:38:33 +0000 (20:38 +0700)]
Lower minimum maintenance_work_mem to 64kB
Since the introduction of TID store, vacuum uses far less memory in
the common case than in versions 16 and earlier. Invoking multiple
rounds of index vacuuming in turn requires a much larger table. It'd
be a good idea anyway to cover this case in regression testing, and a
lower limit is less painful for slow buildfarm animals. The reason to
do it now is to re-enable coverage of the bugfix in commit 83c39a1f7f.
For consistency, give autovacuum_work_mem the same treatment.
Suggested by Andres Freund
Tested by Melanie Plageman
Backpatch to v17, where TID store was introduced
Tom Lane [Fri, 9 Aug 2024 15:21:39 +0000 (11:21 -0400)]
Fix "failed to find plan for subquery/CTE" errors in EXPLAIN.
To deparse a reference to a field of a RECORD-type output of a
subquery, EXPLAIN normally digs down into the subquery's plan to try
to discover exactly which anonymous RECORD type is meant. However,
this can fail if the subquery has been optimized out of the plan
altogether on the grounds that no rows could pass the WHERE quals,
which has been possible at least since 3fc6e2d7f. There isn't
anything remaining in the plan tree that would help us, so fall back
to printing the field name as "fN" for the N'th column of the record.
(This will actually be the right thing some of the time, since it
matches the column names we assign to RowExprs.)
In passing, fix a comment typo in create_projection_plan, which
I noticed while experimenting with an alternative fix for this.
Per bug #18576 from Vasya B. Back-patch to all supported branches.
Alvaro Herrera [Thu, 8 Aug 2024 23:35:13 +0000 (19:35 -0400)]
Refuse ATTACH of a table referenced by a foreign key
Trying to attach a table as a partition which is already on the
referenced side of a foreign key on the partitioned table that it is
being attached to, leads to strange behavior: we try to clone the
foreign key from the parent to the partition, but this new FK points to
the partition itself, and the mix of pg_constraint rows and triggers
doesn't behave well.
Rather than trying to untangle the mess (which might be possible given
sufficient time), I opted to forbid the ATTACH. This doesn't seem a
problematic restriction, given that we already fail to create the
foreign key if you do it the other way around, that is, having the
partition first and the FK second.
Backpatch to all supported branches.
Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18541-628a61bc267cd2d3@postgresql.org
Alvaro Herrera [Thu, 8 Aug 2024 19:17:11 +0000 (15:17 -0400)]
Refactor error messages to reduce duplication
I also took the liberty of changing
errmsg("COPY DEFAULT only available using COPY FROM")
to
errmsg("COPY %s cannot be used with %s", "DEFAULT", "COPY TO")
because the original wording is unlike all other messages that indicate
option incompatibility. This message was added by commit 9f8377f7a279
(16-era), in whose development thread there was no discussion on this
point.
Fix pg_rewind debug output to print the source timeline history
getTimelineHistory() is called twice, to read the source and the
target timeline history files. However, the loop to print the file
with the --debug option used the wrong variable when dealing with the
source. As a result, the source's history was always printed as empty.
Spotted while debugging bug #18575, but this does not fix that bug,
just the debugging output. Backpatch to all supported versions.
Commit 0b9466fce added a dependency on fe_memutils' pnstrdup() inside
informix.c. This adds an exit() path in a library, which we don't
want. (Unlike libpq, the ecpg libraries don't have an automated check
for that, but it makes sense to keep them to a similar standard.) The
ecpg code can already handle failure results from the *strdup() call
by itself.
Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://www.postgresql.org/message-id/CAOYmi+=pg=W5L1h=3MEP_EB24jaBu2FyATrLXqQHGe7cpuvwyg@mail.gmail.com
Tom Lane [Wed, 7 Aug 2024 16:54:39 +0000 (12:54 -0400)]
Fix edge case in plpgsql's make_callstmt_target().
If the plancache entry for the CALL statement is already stale,
it's possible for us to fetch an old procedure OID out of it,
and then fail with "cache lookup failed for function NNN".
In ordinary usage this never happens because make_callstmt_target
is called just once immediately after building the plancache
entry. It can be forced however by setting up an erroneous CALL
(that causes make_callstmt_target itself to report an error),
then dropping/recreating the target procedure, then repeating
the erroneous CALL.
To fix, use SPI_plan_get_cached_plan() to fetch the plancache's
plan, rather than assuming we can use SPI_plan_get_plan_sources().
This shouldn't add any noticeable overhead in the normal case,
and in the stale-plan case we'd have had to replan anyway a little
further down.
The other callers of SPI_plan_get_plan_sources() seem OK, because
either they don't need up-to-date plans or they know that the
query was just (re) planned. But add some commentary in hopes
of not falling into this trap again.
Per bug #18574 from Song Hongyu. Back-patch to v14 where this coding
was introduced. (Older branches have comparable code, but it's run
after any required replanning, so there's no issue.)