From: Mike Bayer Date: Sat, 8 Apr 2023 03:31:55 +0000 (-0400) Subject: walk back SQL Server language a bit re: insertmanyvalues X-Git-Tag: rel_2_0_10~30 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=8b19580216371892eba94e1be410ac255b42c4c0;p=thirdparty%2Fsqlalchemy%2Fsqlalchemy.git walk back SQL Server language a bit re: insertmanyvalues In #9618 we both can look to re-enable insertmanyvalues for SQL Server, and also likely *disable* its use for the ORM unit of work specifically, since that's really where the only problem is, and it will likely be for all dialects, not just SQL Server. An approach using sentinel columns will be rolled out for the unit of work use case. Change-Id: I3358e30839491769db95b4ac042a661271df3929 References: #9618 References: #9603 --- diff --git a/doc/build/changelog/changelog_20.rst b/doc/build/changelog/changelog_20.rst index aeb824133f..1319c82939 100644 --- a/doc/build/changelog/changelog_20.rst +++ b/doc/build/changelog/changelog_20.rst @@ -20,15 +20,21 @@ :tags: bug, mssql :tickets: 9603 - Due to a critical bug identified in SQL Server, the SQLAlchemy - "insertmanyvalues" feature which allows fast INSERT of many rows while also - supporting RETURNING unfortunately needs to be disabled for SQL Server. SQL - Server is apparently unable to guarantee that the order of rows inserted - matches the order in which they are sent back by OUTPUT inserted when - table-valued rows are used with INSERT in conjunction with OUTPUT inserted. - We are trying to see if Microsoft is able to confirm this undocumented - behavior however there is no known workaround, other than it's not safe to - use table-valued expressions with OUTPUT inserted for now. + The SQLAlchemy "insertmanyvalues" feature which allows fast INSERT of + many rows while also supporting RETURNING is temporarily disabled for + SQL Server. As the unit of work currently relies upon this feature such + that it matches existing ORM objects to returned primary key + identities, this particular use pattern does not work with SQL Server + in all cases as the order of rows returned by "OUTPUT inserted" may not + always match the order in which the tuples were sent, leading to + the ORM making the wrong decisions about these objects in subsequent + operations. + + The feature will be re-enabled in an upcoming release and will again + take effect for multi-row INSERT statements, however the unit-of-work's + use of the feature will be disabled, possibly for all dialects, unless + ORM-mapped tables also include a "sentinel" column so that the + returned rows can be referenced back to the original data passed in. .. change:: diff --git a/doc/build/changelog/whatsnew_20.rst b/doc/build/changelog/whatsnew_20.rst index 6b023bb483..02bd22bc6c 100644 --- a/doc/build/changelog/whatsnew_20.rst +++ b/doc/build/changelog/whatsnew_20.rst @@ -859,8 +859,8 @@ Optimized ORM bulk insert now implemented for all backends other than MySQL The dramatic performance improvement introduced in the 1.4 series and described at :ref:`change_5263` has now been generalized to all included backends that support RETURNING, which is all backends other than MySQL: SQLite, MariaDB, -PostgreSQL (all drivers), and Oracle; SQL Server has support but unfortunately -had to be turned off due to an issue with SQL Server [#]_. While the original feature +PostgreSQL (all drivers), and Oracle; SQL Server has support but is +temporarily disabled in version 2.0.9 [#]_. While the original feature was most critical for the psycopg2 driver which otherwise had major performance issues when using ``cursor.executemany()``, the change is also critical for other PostgreSQL drivers such as asyncpg, as when using RETURNING, @@ -985,7 +985,10 @@ mariadb+mysqldb (network) 71.705197 4.075377 .. [#] The feature is disabled for SQL Server as of SQLAlchemy 2.0.9 due to incompatibilities in how table-valued expressions are handled by - SQL Server. See https://github.com/sqlalchemy/sqlalchemy/issues/9603 + SQL Server regarding the ORM unit of work. An upcoming release will + re-enable it with unit-of-work oriented adjustments. + See https://github.com/sqlalchemy/sqlalchemy/issues/9603 and + https://github.com/sqlalchemy/sqlalchemy/issues/9618. Two additional drivers have no change in performance; the psycopg2 drivers, for which fast executemany was already implemented in SQLAlchemy 1.4, diff --git a/lib/sqlalchemy/dialects/mssql/base.py b/lib/sqlalchemy/dialects/mssql/base.py index 808fdf16fe..a77cced7e0 100644 --- a/lib/sqlalchemy/dialects/mssql/base.py +++ b/lib/sqlalchemy/dialects/mssql/base.py @@ -253,9 +253,10 @@ The process for fetching this value has several variants: .. note:: SQLAlchemy 2.0 introduced the :ref:`engine_insertmanyvalues` feature for SQL Server, which is used by default to optimize many-row - INSERT statements; however as of SQLAlchemy 2.0.9 this feature had - to be turned off for SQL Server as the database does not support - deterministic RETURNING of INSERT rows for a multi-row INSERT statement. + INSERT statements; however as of SQLAlchemy 2.0.9 this feature is + temporarily disabled for SQL Server, until adjustments can be made + so that the ORM unit of work does not rely upon the ordering of returned + rows. * When RETURNING is not available or has been disabled via ``implicit_returning=False``, either the ``scope_identity()`` function or