:ticket:`12948`
+.. _change_13014_postgresql:
+
+Support for Server-Side Monotonic Functions such as uuidv7() in Batched INSERT Operations
+------------------------------------------------------------------------------------------
+
+SQLAlchemy 2.1 adds support for using monotonic server-side functions, such as
+PostgreSQL 18's ``uuidv7()`` function, as sentinels in the
+:ref:`engine_insertmanyvalues` feature. This allows these functions to work
+efficiently with batched INSERT operations while maintaining deterministic row
+ordering.
+
+When using a monotonic function as a default value, the ``monotonic=True``
+parameter must be passed to the function to indicate that it produces
+monotonically increasing values. This enables SQLAlchemy to use the function's
+values to correlate RETURNING results with input parameter sets::
+
+ from sqlalchemy import Table, Column, MetaData, UUID, Integer, func
+
+ metadata = MetaData()
+
+ t = Table(
+ "t",
+ metadata,
+ Column("id", UUID, server_default=func.uuidv7(monotonic=True), primary_key=True),
+ Column("x", Integer),
+ )
+
+With the above configuration, when performing a batched INSERT with RETURNING
+on PostgreSQL, SQLAlchemy will generate SQL that properly orders the rows
+while allowing the server to generate the UUID values:
+
+.. sourcecode:: sql
+
+ INSERT INTO t (x) SELECT p0::INTEGER FROM
+ (VALUES (%(x__0)s, 0), (%(x__1)s, 1), (%(x__2)s, 2), ...)
+ AS imp_sen(p0, sen_counter) ORDER BY sen_counter
+ RETURNING t.id, t.id AS id__1
+
+The returned rows are then sorted by the monotonically increasing UUID values
+to match the order of the input parameters, ensuring that ORM objects and
+returned values are properly correlated.
+
+This feature works with both :paramref:`_schema.Column.server_default` (for
+DDL-level defaults) and :paramref:`_schema.Column.default` (for ad-hoc
+server-side function calls).
+
+.. seealso::
+
+ :ref:`engine_insertmanyvalues_monotonic_functions` - Complete documentation
+ on using monotonic functions
+
+ :ref:`postgresql_monotonic_functions` - PostgreSQL-specific examples
+
+:ticket:`13014`
+
Microsoft SQL Server
====================
--- /dev/null
+.. change::
+ :tags: feature, postgresql, sql
+ :tickets: 13014
+
+ Added support for monotonic server-side functions such as PostgreSQL 18's
+ ``uuidv7()`` to work with the :ref:`engine_insertmanyvalues` feature.
+ By passing ``monotonic=True`` to any :class:`.Function`, the function can
+ be used as a sentinel for tracking row order in batched INSERT operations
+ with RETURNING, allowing the ORM and Core to efficiently batch INSERT
+ statements while maintaining deterministic row ordering.
+
+ .. seealso::
+
+ :ref:`change_13014_postgresql`
+
+ :ref:`engine_insertmanyvalues_monotonic_functions`
+
+ :ref:`postgresql_monotonic_functions`
including sample performance tests
.. tip:: The :term:`insertmanyvalues` feature is a **transparently available**
- performance feature which requires no end-user intervention in order for
- it to take place as needed. This section describes the architecture
- of the feature as well as how to measure its performance and tune its
- behavior in order to optimize the speed of bulk INSERT statements,
+ performance feature which typically requires no end-user intervention in
+ order for it to take place as needed. This section describes the
+ architecture of the feature as well as how to measure its performance and
+ tune its behavior in order to optimize the speed of bulk INSERT statements,
particularly as used by the ORM.
As more databases have added support for INSERT..RETURNING, SQLAlchemy has
below).
-
-
.. _engine_insertmanyvalues_sentinel_columns:
Configuring Sentinel Columns
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In typical cases, the "insertmanyvalues" feature in order to provide
INSERT..RETURNING with deterministic row order will automatically determine a
integer column named "_sentinel" that can be used by the "insertmanyvalues"
feature to help optimize bulk inserts used by the ORM.
+.. _engine_insertmanyvalues_monotonic_functions:
+
+Configuring Monotonic Functions such as UUIDV7
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Using a monotonic function such as uuidv7 is supported by the "insertmanyvalues"
+feature most easily by establishing the function as a client side callable,
+e.g. using Python's built-in ``uuid.uuid7()`` call by providing the callable
+to the :paramref:`_schema.Connection.default` parameter::
+
+ import uuid
+
+ from sqlalchemy import UUID, Integer
+
+ t = Table(
+ "t",
+ metadata,
+ Column("id", UUID, default=uuid.uuid7, primary_key=True),
+ Column("x", Integer),
+ )
+
+In the above example, SQLAlchemy will invoke Python's ``uuid.uuid7()`` function
+to create new primary key identifiers, which will be batchable by the
+"insertmanyvalues" feature.
+
+However, some databases like PostgreSQL provide a server-side function for
+uuid7 called ``uuidv7()``; in SQLAlchemy, this would be available from the
+:data:`_sql.func` namespace as ``func.uuidv7()``, and may be configured on a
+:class:`.Column` using either :paramref:`_schema.Connection.default` to allow
+it to be called as needed, or :paramref:`_schema.Connection.server_default` to
+establish it as part of the table's DDL. However, for full batched "insertmanyvalues"
+behavior including support for sorted RETURNING (as would allow the ORM to
+most effectively batch INSERT statements), an additional directive must be
+included indicating that the function produces
+monotonically increasing values, which is the ``monotonic=True`` directive.
+This is illustrated below as a DDL server default using
+:paramref:`_schema.Connection.server_default`::
+
+ from sqlalchemy import func, Integer
+
+ t = Table(
+ "t",
+ metadata,
+ Column("id", UUID, server_default=func.uuidv7(monotonic=True), primary_key=True),
+ Column("x", Integer),
+ )
+
+Using the above form, a batched INSERT...RETURNING on PostgreSQL with
+:paramref:`.UpdateBase.returning.sort_by_parameter_order` set to True will
+look like:
+
+.. sourcecode:: sql
+
+ INSERT INTO t (x) SELECT p0::INTEGER FROM
+ (VALUES (%(x__0)s, 0), (%(x__1)s, 1), (%(x__2)s, 2), ...)
+ AS imp_sen(p0, sen_counter) ORDER BY sen_counter
+ RETURNING t.id, t.id AS id__1
+
+Similarly if the function is configured as an ad-hoc server side function
+using :paramref:`_schema.Connection.default`::
+
+ t = Table(
+ "t",
+ metadata,
+ Column("id", UUID, default=func.uuidv7(monotonic=True), primary_key=True),
+ Column("x", Integer),
+ )
+
+The function will then be rendered in the SQL statement explicitly:
+
+.. sourcecode:: sql
+
+ INSERT INTO t (id, x) SELECT uuidv7(), p1::INTEGER FROM
+ (VALUES (%(x__0)s, 0), (%(x__1)s, 1), (%(x__2)s, 2), ...)
+ AS imp_sen(p1, sen_counter) ORDER BY sen_counter
+ RETURNING t.id, t.id AS id__1
+
+.. versionadded:: 2.1 Added support for explicit monotonic server side functions
+ using ``monotonic=True`` with any :class:`.Function`.
+
+.. seealso::
+
+ :ref:`postgresql_monotonic_functions`
+
.. _engine_insertmanyvalues_page_size:
PRIMARY KEY (id)
)
+.. _postgresql_monotonic_functions:
+
+PostgreSQL 18 and above UUID with uuidv7 as a server default
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+PostgreSQL 18's ``uuidv7`` SQL function is available as any other
+SQL function using the :data:`_sql.func` namespace::
+
+ >>> from sqlalchemy import select, func
+ >>> print(select(func.uuidv7()))
+ SELECT uuidv7() AS uuidv7_1
+
+When using ``func.uuidv7()`` as a default on a :class:`.Column` using either
+Core or ORM, an extra directive ``monotonic=True`` may be passed which
+indicates this function produces monotonically increasing values; this in turn
+allows Core and ORM to use a more efficient batched form of INSERT for large
+insert operations::
+
+ import uuid
+
+
+ class MyClass(Base):
+ __tablename__ = "my_table"
+
+ id: Mapped[uuid.UUID] = mapped_column(
+ server_default=func.uuidv7(monotonic=True)
+ )
+
+With the above mapping, the ORM will be able to efficiently batch rows when
+running bulk insert operations using the :ref:`engine_insertmanyvalues`
+feature.
+
+.. versionadded:: 2.1 Added ``monotonic=True`` to allow functions like PostgreSQL's
+ ``uuidv7()`` to work with batched "insertmanyvalues"
+
+.. seealso::
+
+ :ref:`engine_insertmanyvalues_monotonic_functions`
+
+
.. _postgresql_ss_cursors:
Server Side Cursors
SERVERSIDE = "serverside"
IDENTITY = "identity"
SEQUENCE = "sequence"
+ MONOTONIC_FUNCTION = "monotonic"
class _SentinelColumnCharacterization(NamedTuple):
AUTOINCREMENT = 2
IDENTITY = 4
SEQUENCE = 8
+ MONOTONIC_FUNCTION = 16
- ANY_AUTOINCREMENT = AUTOINCREMENT | IDENTITY | SEQUENCE
+ ANY_AUTOINCREMENT = (
+ AUTOINCREMENT | IDENTITY | SEQUENCE | MONOTONIC_FUNCTION
+ )
_SUPPORTED_OR_NOT = NOT_SUPPORTED | ANY_AUTOINCREMENT
- USE_INSERT_FROM_SELECT = 16
+ USE_INSERT_FROM_SELECT = 32
RENDER_SELECT_COL_CASTS = 64
_SentinelDefaultCharacterization.SEQUENCE: (
InsertmanyvaluesSentinelOpts.SEQUENCE
),
+ _SentinelDefaultCharacterization.MONOTONIC_FUNCTION: (
+ InsertmanyvaluesSentinelOpts.MONOTONIC_FUNCTION
+ ),
}
)
_sentinel_col_autoinc_lookup = _sentinel_col_non_autoinc_lookup.union(
packagenames: Tuple[str, ...] = ()
+ monotonic: bool = False
+
_has_args = False
_with_ordinality = False
_table_value_type: Optional[TableValueType] = None
*clauses: _ColumnExpressionOrLiteralArgument[_T],
type_: None = ...,
packagenames: Optional[Tuple[str, ...]] = ...,
+ monotonic: bool = ...,
) -> None: ...
@overload
*clauses: _ColumnExpressionOrLiteralArgument[Any],
type_: _TypeEngineArgument[_T] = ...,
packagenames: Optional[Tuple[str, ...]] = ...,
+ monotonic: bool = ...,
) -> None: ...
def __init__(
*clauses: _ColumnExpressionOrLiteralArgument[Any],
type_: Optional[_TypeEngineArgument[_T]] = None,
packagenames: Optional[Tuple[str, ...]] = None,
+ monotonic: bool = False,
) -> None:
"""Construct a :class:`.Function`.
"""
self.packagenames = packagenames or ()
self.name = name
+ self.monotonic = monotonic
# if type is None, we get NULLTYPE, which is our _T. But I don't
# know how to get the overloads to express that correctly
default_characterization = (
_SentinelDefaultCharacterization.SENTINEL_DEFAULT
)
+ elif the_sentinel_zero.default._is_monotonic_fn:
+ default_characterization = (
+ _SentinelDefaultCharacterization.MONOTONIC_FUNCTION
+ )
elif default_is_sequence(the_sentinel_zero.default):
if the_sentinel_zero.default._increment_is_negative:
if sentinel_is_explicit:
)
elif the_sentinel_zero.server_default is not None:
if sentinel_is_explicit:
- raise exc.InvalidRequestError(
- f"Column {the_sentinel[0]} can't be a sentinel column "
- "because it uses an explicit server side default "
- "that's not the Identity() default."
+ if not the_sentinel_zero.server_default._is_monotonic_fn:
+ raise exc.InvalidRequestError(
+ f"Column {the_sentinel[0]} can't be a sentinel "
+ "column "
+ "because it uses an explicit server side default "
+ "that's not the Identity() default."
+ )
+ else:
+ default_characterization = (
+ _SentinelDefaultCharacterization.MONOTONIC_FUNCTION
+ )
+ else:
+ default_characterization = (
+ _SentinelDefaultCharacterization.SERVERSIDE
)
- default_characterization = (
- _SentinelDefaultCharacterization.SERVERSIDE
- )
-
if the_sentinel is None and self.primary_key:
assert autoinc_col is None
# determine for non-autoincrement pk if all elements are
# client side
for _pkc in self.primary_key:
- if _pkc.server_default is not None or (
- _pkc.default and not _pkc.default.is_callable
+ if (
+ _pkc.server_default is not None
+ and not _pkc.server_default._is_monotonic_fn
+ ):
+ break
+
+ if (
+ _pkc.default
+ and not _pkc.default.is_callable
+ and not _pkc.default._is_monotonic_fn
):
break
else:
is_scalar = False
has_arg = False
is_sentinel = False
+ _is_monotonic_fn = False
column: Optional[Column[Any]]
def __init__(self, for_update: bool = False) -> None:
arg: Any
+ _is_monotonic_fn = False
+
@overload
def __new__(
cls, arg: Callable[..., Any], for_update: bool = ...
arg=self.arg, for_update=self.for_update
)
+ @util.memoized_property
+ @util.preload_module("sqlalchemy.sql.functions")
+ def _is_monotonic_fn(self) -> bool:
+ functions = util.preloaded.sql_functions
+ return (
+ isinstance(self.arg, functions.FunctionElement)
+ and self.arg.monotonic
+ )
+
@util.memoized_property
@util.preload_module("sqlalchemy.sql.sqltypes")
def _arg_is_typed(self) -> bool:
has_argument = False
is_clause_element = False
is_identity = False
+ _is_monotonic_fn = False
column: Optional[Column[Any]]
self.arg = arg
self.reflected = _reflected
+ @util.memoized_property
+ @util.preload_module("sqlalchemy.sql.functions")
+ def _is_monotonic_fn(self) -> bool:
+ functions = util.preloaded.sql_functions
+ return (
+ isinstance(self.arg, functions.FunctionElement)
+ and self.arg.monotonic
+ )
+
def _copy(self) -> DefaultClause:
return DefaultClause(
arg=self.arg, for_update=self.for_update, _reflected=self.reflected
import datetime
+from typing import Any
from sqlalchemy import and_
from sqlalchemy import cast
from sqlalchemy.testing import eq_
from sqlalchemy.testing import expect_warnings
from sqlalchemy.testing import fixtures
+from sqlalchemy.testing import mock
from sqlalchemy.testing.assertsql import CursorSQL
from sqlalchemy.testing.assertsql import DialectSQL
metadata.create_all(connection)
self._assert_data_noautoincrement(connection, table)
- def test_full_cursor_insertmanyvalues_sql(self, metadata, connection):
+ @testing.variation(
+ "default_type",
+ [
+ "ss_sequence",
+ ("sd_uuidv7", testing.only_on("postgresql>=18")),
+ ("cd_uuidv7", testing.only_on("postgresql>=18")),
+ ],
+ )
+ @testing.variation("set_sentinel", [True, False])
+ def test_full_cursor_insertmanyvalues_sql(
+ self,
+ metadata,
+ connection,
+ default_type: testing.Variation,
+ set_sentinel: testing.Variation,
+ ):
"""test compilation/ execution of the subquery form including
the fix for #13015
"""
- my_table = Table(
- "my_table",
- metadata,
- Column("data1", String(50)),
- Column(
+ if set_sentinel:
+ col_kw: dict[str, Any] = {"insert_sentinel": True}
+ else:
+ col_kw = {}
+
+ if default_type.ss_sequence:
+ col = Column(
"id",
Integer,
Sequence("foo_id_seq", start=1, data_type=Integer),
primary_key=True,
- ),
+ **col_kw,
+ )
+ elif default_type.sd_uuidv7:
+ col = Column(
+ "id",
+ Uuid(),
+ server_default=func.uuidv7(monotonic=True),
+ primary_key=True,
+ **col_kw,
+ )
+ elif default_type.cd_uuidv7:
+ col = Column(
+ "id",
+ Uuid(),
+ default=func.uuidv7(monotonic=True),
+ primary_key=True,
+ **col_kw,
+ )
+ else:
+ default_type.fail()
+
+ my_table = Table(
+ "my_table",
+ metadata,
+ Column("data1", String(50)),
+ col,
Column("data2", String(50)),
)
my_table.create(connection)
with self.sql_execution_asserter(connection) as assert_:
- connection.execute(
+ result = connection.execute(
my_table.insert().returning(
my_table.c.data1,
my_table.c.id,
],
)
+ rows = result.all()
+ if default_type.ss_sequence:
+ eq_(rows, [(f"d1 row {i}", i + 1) for i in range(10)])
+ else:
+ # monotonic UUIDs are sorted
+ eq_(
+ list(sorted(rows, key=lambda row: row.id)),
+ [(f"d1 row {i}", mock.ANY) for i in range(10)],
+ )
+
render_bind_casts = (
String().dialect_impl(connection.dialect).render_bind_cast
)
else:
assert False
+ if default_type.ss_sequence:
+ sqlfunc_sql = "nextval('foo_id_seq'), "
+ ins_cols = "id, "
+ p2 = "p2"
+ elif default_type.sd_uuidv7:
+ sqlfunc_sql = ""
+ ins_cols = ""
+ p2 = "p1"
+ elif default_type.cd_uuidv7:
+ sqlfunc_sql = "uuidv7(), "
+ ins_cols = "id, "
+ p2 = "p2"
+ else:
+ default_type.fail()
+
assert_.assert_(
CursorSQL(
- "INSERT INTO my_table (data1, id, data2) "
- f"SELECT p0::VARCHAR, nextval('foo_id_seq'), p2::VARCHAR "
+ f"INSERT INTO my_table (data1, {ins_cols}data2) "
+ f"SELECT p0::VARCHAR, {sqlfunc_sql}{p2}::VARCHAR "
f"FROM (VALUES {params}) "
- "AS imp_sen(p0, p2, sen_counter) ORDER BY sen_counter "
+ f"AS imp_sen(p0, {p2}, sen_counter) ORDER BY sen_counter "
"RETURNING my_table.data1, my_table.id, my_table.id AS id__1",
parameters,
)