--- /dev/null
+.. change::
+ :tags: bug, mssql
+ :tickets: 9603
+
+ Due to a critical bug identified in SQL Server, the SQLAlchemy
+ "insertmanyvalues" feature which allows fast INSERT of many rows while also
+ supporting RETURNING unfortunately needs to be disabled for SQL Server. SQL
+ Server is apparently unable to guarantee that the order of rows inserted
+ matches the order in which they are sent back by OUTPUT inserted when
+ table-valued rows are used with INSERT in conjunction with OUTPUT inserted.
+ We are trying to see if Microsoft is able to confirm this undocumented
+ behavior however there is no known workaround, other than it's not safe to
+ use table-valued expressions with OUTPUT inserted for now.
+
The dramatic performance improvement introduced in the 1.4 series and described
at :ref:`change_5263` has now been generalized to all included backends that
support RETURNING, which is all backends other than MySQL: SQLite, MariaDB,
-PostgreSQL (all drivers), Oracle, and SQL Server. While the original feature
+PostgreSQL (all drivers), and Oracle; SQL Server has support but unfortunately
+had to be turned off due to an issue with SQL Server [#]_. While the original feature
was most critical for the psycopg2 driver which otherwise had major performance
issues when using ``cursor.executemany()``, the change is also critical for
other PostgreSQL drivers such as asyncpg, as when using RETURNING,
MySQL, given that MariaDB supports it), the new change generalizes the psycopg2
"fast execution helper" approach to all dialects that support RETURNING, which
now includes SQlite and MariaDB, and for which no other approach for
-"executemany plus RETURNING" is possible, which includes SQLite, MariaDB, all
-PG drivers, and SQL Server. The cx_Oracle and oracledb drivers used for Oracle
+"executemany plus RETURNING" is possible, which includes SQLite, MariaDB, and all
+PG drivers. The cx_Oracle and oracledb drivers used for Oracle
support RETURNING with executemany natively, and this has also been implemented
-to provide equivalent performance improvements. With SQLite and MariaDB now
+to provide equivalent performance improvements. With SQLite and MariaDB now
offering RETURNING support, ORM use of ``cursor.lastrowid`` is nearly a thing
of the past, with only MySQL still relying upon it.
For INSERT statements that don't use RETURNING, traditional executemany()
-behavior is used for most backends, with the current exceptions of psycopg2
-and mssql+pyodbc, which both have very slow executemany() performance overall
+behavior is used for most backends, with the current exception of psycopg2,
+which has very slow executemany() performance overall
and are still improved by the "insertmanyvalues" approach.
Benchmarks
postgresql+asyncpg (network) 88.292285 4.561492
postgresql+psycopg (network) N/A (psycopg3) 4.861368
oracle+cx_Oracle (network) 92.603953 4.809520
-mssql+pyodbc (network) 158.396667 4.825139
mariadb+mysqldb (network) 71.705197 4.075377
============================ ==================== ====================
+.. mssql+pyodbc (network) .. 158.396667 .. 4.825139
+
+
+.. note::
+
+ .. [#] The feature is disabled for SQL Server as of SQLAlchemy 2.0.9 due
+ to incompatibilities in how table-valued expressions are handled by
+ SQL Server. See https://github.com/sqlalchemy/sqlalchemy/issues/9603
+
Two additional drivers have no change in performance; the psycopg2 drivers,
for which fast executemany was already implemented in SQLAlchemy 1.4,
and MySQL, which continues to not offer RETURNING support:
* SQLite - supported for SQLite versions 3.35 and above
* PostgreSQL - all supported Postgresql versions (9 and above)
-* SQL Server - all supported SQL Server versions
+* SQL Server - **disabled by default as of SQLAlchemy 2.0.9** - the SQL syntax
+ used has been shown to not be safe for RETURNING
+ (see https://github.com/sqlalchemy/sqlalchemy/issues/9603)
* MariaDB - supported for MariaDB versions 10.5 and above
* MySQL - no support, no RETURNING feature is present
* Oracle - supports RETURNING with executemany using native cx_Oracle / OracleDB
)
The reason one might want to disable RETURNING for a specific table is to
-work around backend-specific limitations. For example, there is a known
-limitation of SQL Server that the ``OUTPUT inserted.<colname>`` feature
-may not work correctly for a table that has INSERT triggers established;
-such a table may need to include ``implicit_returning=False`` (see
-:ref:`mssql_triggers`).
+work around backend-specific limitations.
.. _engine_insertmanyvalues_page_size:
healthy distance away from PostgreSQL's limit of 32767 and SQLite's modern
limit of 32766, while leaving room for additional parameters in the statement
as well as for DBAPI quirkiness). Older versions of SQLite (prior to 3.32.0)
-will set this value to 999; SQL Server sets it to 2099. MariaDB has no
-established limit however 32700 remains as a limiting factor for SQL message
-size.
+will set this value to 999. MariaDB has no established limit however 32700
+remains as a limiting factor for SQL message size.
The value of the "batch size" can be affected :class:`_engine.Engine`
wide via the :paramref:`_sa.create_engine.insertmanyvalues_page_size` parameter.
INSERT INTO t (x) OUTPUT inserted.id VALUES (?)
- As of SQLAlchemy 2.0, the :ref:`engine_insertmanyvalues` feature is also
- used by default to optimize many-row INSERT statements; for SQL Server
- the feature takes place for both RETURNING and-non RETURNING
- INSERT statements.
-
-* The value of :paramref:`_sa.create_engine.insertmanyvalues_page_size`
- defaults to 1000, however the ultimate page size for a particular INSERT
- statement may be limited further, based on an observed limit of
- 2100 bound parameters for a single statement in SQL Server.
- The page size may also be modified on a per-engine
- or per-statement basis; see the section
- :ref:`engine_insertmanyvalues_page_size` for details.
+ .. note:: SQLAlchemy 2.0 introduced the :ref:`engine_insertmanyvalues`
+ feature for SQL Server, which is used by default to optimize many-row
+ INSERT statements; however as of SQLAlchemy 2.0.9 this feature had
+ to be turned off for SQL Server as the database does not support
+ deterministic RETURNING of INSERT rows for a multi-row INSERT statement.
* When RETURNING is not available or has been disabled via
``implicit_returning=False``, either the ``scope_identity()`` function or
# may be changed at server inspection time for older SQL server versions
supports_multivalues_insert = True
- use_insertmanyvalues = True
+ # disabled due to #9603
+ use_insertmanyvalues = False
# note pyodbc will set this to False if fast_executemany is set,
# as of SQLAlchemy 2.0.9
super().__init__(**opts)
+ if self.use_insertmanyvalues:
+ raise exc.ArgumentError(
+ "The use_insertmanyvalues feature on SQL Server is currently "
+ "not safe to use, as returned result rows may be returned in "
+ "random order. Ensure use_insertmanyvalues is left at its "
+ "default of False (this setting changed in SQLAlchemy 2.0.9)"
+ )
+
self._json_serializer = json_serializer
self._json_deserializer = json_deserializer
Fast Executemany Mode
---------------------
-.. note:: SQLAlchemy 2.0 now includes an equivalent "fast executemany"
- handler for INSERT statements that is more robust than the PyODBC feature
- (but is not quite as performant particularly for very large datasets);
- the feature is called :ref:`insertmanyvalues <engine_insertmanyvalues>`
- and is enabled for all INSERT statements by default.
- SQLAlchemy's feature integrates with the PyODBC ``setinputsizes()`` method
- which allows for more accurate specification of datatypes, and additionally
- uses a dynamically sized, batched approach that scales to any number of
- columns and/or rows.
-
- The SQL Server ``fast_executemany`` parameter may be used at the same time
- as ``insertmanyvalues`` is enabled; however, the parameter will not be used
- for INSERT statements that include RETURNING.
+ .. note:: SQLAlchemy 2.0 introduced the :ref:`engine_insertmanyvalues`
+ feature for SQL Server, which is used by default to optimize many-row
+ INSERT statements; however as of SQLAlchemy 2.0.9 this feature had
+ to be turned off for SQL Server as the database does not support
+ deterministic RETURNING of INSERT rows for a multi-row INSERT statement.
.. versionchanged:: 2.0.9 - ``fast_executemany`` executions will be used
for INSERT statements that don't include RETURNING, when
``fast_executemany`` is set. Previously, ``use_insertmanyvalues`` would
cause ``fast_executemany`` to not be used in most cases.
+ ``use_insertmanyvalues`` is disabled for SQL Server overall as of 2.0.9.
+
The PyODBC driver includes support for a "fast executemany" mode of execution
which greatly reduces round trips for a DBAPI ``executemany()`` call when using
Microsoft ODBC drivers, for **limited size batches that fit in memory**. The
from unittest.mock import Mock
from sqlalchemy import Column
+from sqlalchemy import create_engine
from sqlalchemy import event
from sqlalchemy import exc
from sqlalchemy import inspect
@testing.variation("add_event", [True, False])
@testing.variation("setinputsizes", [True, False])
@testing.variation("fastexecutemany", [True, False])
- @testing.variation("insertmanyvalues", [True, False])
+ @testing.variation("insertmanyvalues", [False]) # disabled due to #9603
@testing.variation("broken_types", [True, False])
def test_insert_typing(
self,
__only_on__ = "mssql"
__backend__ = True
+ def test_no_insertmanyvalues(self):
+ with expect_raises_message(
+ exc.ArgumentError,
+ "The use_insertmanyvalues feature on SQL Server is "
+ "currently not safe to use",
+ ):
+ create_engine("mssql+pyodbc://", use_insertmanyvalues=True)
+
@testing.variation("enable_comments", [True, False])
def test_comments_enabled_disabled(
self, testing_engine, metadata, enable_comments
from sqlalchemy import testing
from sqlalchemy.dialects.mssql import base as mssql
from sqlalchemy.dialects.mssql import pyodbc as mssql_pyodbc
+from sqlalchemy.orm import Session
from sqlalchemy.testing import AssertsCompiledSQL
from sqlalchemy.testing import config
from sqlalchemy.testing import engines
__only_on__ = "mssql"
__backend__ = True
+ def test_broken_table_values_test(self, decl_base, connection):
+ """test #9603.
+
+ this uses the ORM to ensure the ORM is not using any kind of
+ insertmany that causes the problem.
+
+ """
+
+ class Datum(decl_base):
+
+ __tablename__ = "datum"
+
+ id = Column(Integer, autoincrement=False, primary_key=True)
+ data = Column(String(10))
+
+ class Result(decl_base):
+
+ __tablename__ = "result"
+
+ id = Column(Integer, primary_key=True)
+
+ thing = Column(Integer)
+ lft_datum_id = Column(Integer, ForeignKey(Datum.id))
+
+ decl_base.metadata.create_all(connection)
+ with Session(connection) as sess:
+
+ size = 15
+ datum_ids = list(range(1, size + 1))
+
+ sess.add_all([Datum(id=id_, data=f"d{id_}") for id_ in datum_ids])
+ sess.flush()
+
+ result_data = [
+ Result(thing=num, lft_datum_id=datum_ids[num % size])
+ for num in range(size * size)
+ ]
+ sess.add_all(result_data)
+ sess.flush()
+
+ # this is what we expected we put in
+ the_data_in_order_should_be = [
+ (num + 1, num, datum_ids[num % size])
+ for num in range(size * size)
+ ]
+
+ # and yes, that's what went in
+ eq_(
+ sess.execute(
+ select(
+ Result.id, Result.thing, Result.lft_datum_id
+ ).order_by(Result.id)
+ ).all(),
+ the_data_in_order_should_be,
+ )
+
+ # however, if insertmanyvalues is turned on, OUTPUT inserted
+ # did not give us the rows in the order we sent, so ids were
+ # mis-applied. even if we sort the original records by the
+ # ids that were given
+ eq_(
+ [
+ (r.id, r.thing, r.lft_datum_id)
+ for r in sorted(result_data, key=lambda r: r.id)
+ ],
+ the_data_in_order_should_be,
+ )
+
def test_fetchid_trigger(self, metadata, connection):
# TODO: investigate test hang on mssql when connection fixture is used
"""
id_="ia",
)
@testing.combinations(
- ("insertmany", True),
+ # disabled due to #9603
+ # ("insertmany", True),
("insertsingle", False),
argnames="insertmany",
id_="ia",
Conditional(
expect_returning,
[
- CompiledSQL(
- "INSERT INTO test (id, bar) VALUES (:id, :bar) "
- "RETURNING test.foo",
+ Conditional(
+ connection.dialect.insert_executemany_returning,
[
- {"id": 1, "bar": 6},
- {"id": 2, "bar": 6},
+ CompiledSQL(
+ "INSERT INTO test (id, bar) "
+ "VALUES (:id, :bar) "
+ "RETURNING test.foo",
+ [
+ {"id": 1, "bar": 6},
+ {"id": 2, "bar": 6},
+ ],
+ )
],
- )
+ [
+ CompiledSQL(
+ "INSERT INTO test (id, bar) "
+ "VALUES (:id, :bar) "
+ "RETURNING test.foo",
+ {"id": 1, "bar": 6},
+ ),
+ CompiledSQL(
+ "INSERT INTO test (id, bar) "
+ "VALUES (:id, :bar) "
+ "RETURNING test.foo",
+ {"id": 2, "bar": 6},
+ ),
+ ],
+ ),
],
[
CompiledSQL(