From: Mike Bayer Date: Wed, 4 Dec 2024 14:22:14 +0000 (-0500) Subject: document name normalize X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=72cf5e4089f3bd1052db89705558bfea304d07dc;p=thirdparty%2Fsqlalchemy%2Fsqlalchemy.git document name normalize Fixes: #10789 Change-Id: I795d92c900502e4b2fde7ab11e8adb9b03d5b782 --- diff --git a/lib/sqlalchemy/dialects/oracle/base.py b/lib/sqlalchemy/dialects/oracle/base.py index ebd13d21a7..3d3ff9d517 100644 --- a/lib/sqlalchemy/dialects/oracle/base.py +++ b/lib/sqlalchemy/dialects/oracle/base.py @@ -165,15 +165,150 @@ Identifier Casing ----------------- In Oracle Database, the data dictionary represents all case insensitive -identifier names using UPPERCASE text. SQLAlchemy on the other hand considers -an all-lower case identifier name to be case insensitive. The Oracle Database -dialects convert all case insensitive identifiers to and from those two formats -during schema level communication, such as reflection of tables and indexes. -Using an UPPERCASE name on the SQLAlchemy side indicates a case sensitive -identifier, and SQLAlchemy will quote the name - this will cause mismatches -against data dictionary data received from Oracle Database, so unless -identifier names have been truly created as case sensitive (i.e. using quoted -names), all lowercase names should be used on the SQLAlchemy side. +identifier names using UPPERCASE text. This is in contradiction to the +expectations of SQLAlchemy, which assume a case insensitive name is represented +as lowercase text. + +As an example of case insensitive identifier names, consider the following table: + +.. sourcecode:: sql + + CREATE TABLE MyTable (Identifier INTEGER PRIMARY KEY) + +If you were to ask Oracle Database for information about this table, the +table name would be reported as ``MYTABLE`` and the column name would +be reported as ``IDENTIFIER``. Compare to most other databases such as +PostgreSQL and MySQL which would report these names as ``mytable`` and +``identifier``. The names are **not quoted, therefore are case insensitive**. +The special casing of ``MyTable`` and ``Identifier`` would only be maintained +if they were quoted in the table definition: + +.. sourcecode:: sql + + CREATE TABLE "MyTable" ("Identifier" INTEGER PRIMARY KEY) + +When constructing a SQLAlchemy :class:`.Table` object, **an all lowercase name +is considered to be case insensitive**. So the following table assumes +case insensitive names:: + + Table("mytable", metadata, Column("identifier", Integer, primary_key=True)) + +Whereas when mixed case or UPPERCASE names are used, case sensitivity is +assumed:: + + Table("MyTable", metadata, Column("Identifier", Integer, primary_key=True)) + +A similar situation occurs at the database driver level when emitting a +textual SQL SELECT statement and looking at column names in the DBAPI +``cursor.description`` attribute. A database like PostgreSQL will normalize +case insensitive names to be lowercase:: + + >>> pg_engine = create_engine("postgresql://scott:tiger@localhost/test") + >>> pg_connection = pg_engine.connect() + >>> result = pg_connection.exec_driver_sql("SELECT 1 AS SomeName") + >>> result.cursor.description + (Column(name='somename', type_code=23),) + +Whereas Oracle normalizes them to UPPERCASE:: + + >>> oracle_engine = create_engine("oracle+oracledb://scott:tiger@oracle18c/xe") + >>> oracle_connection = oracle_engine.connect() + >>> result = oracle_connection.exec_driver_sql( + ... "SELECT 1 AS SomeName FROM DUAL" + ... ) + >>> result.cursor.description + [('SOMENAME', , 127, None, 0, -127, True)] + +In order to achieve cross-database parity for the two cases of a. table +reflection and b. textual-only SQL statement round trips, SQLAlchemy performs a step +called **name normalization** when using the Oracle dialect. This process may +also apply to other third party dialects that have similar UPPERCASE handling +of case insensitive names. + +When using name normalization, SQLAlchemy attempts to detect if a name is +case insensitive by checking if all characters are UPPERCASE letters only; +if so, then it assumes this is a case insensitive name and is delivered as +a lowercase name. + +For table reflection, a tablename that is seen represented as all UPPERCASE +in Oracle Database's catalog tables will be assumed to have a case insensitive +name. This is what allows the ``Table`` definition to use lower case names +and be equally compatible from a reflection point of view on Oracle Database +and all other databases such as PostgreSQL and MySQL:: + + # matches a table created with CREATE TABLE mytable + Table("mytable", metadata, autoload_with=some_engine) + +Above, the all lowercase name ``"mytable"`` is case insensitive; it will match +a table reported by PostgreSQL as ``"mytable"`` and a table reported by +Oracle as ``"MYTABLE"``. If name normalization were not present, it would +not be possible for the above :class:`.Table` definition to be introspectable +in a cross-database way, since we are dealing with a case insensitive name +that is not reported by each database in the same way. + +Case sensitivity can be forced on in this case, such as if we wanted to represent +the quoted tablename ``"MYTABLE"`` with that exact casing, most simply by using +that casing directly, which will be seen as a case sensitive name:: + + # matches a table created with CREATE TABLE "MYTABLE" + Table("MYTABLE", metadata, autoload_with=some_engine) + +For the unusual case of a quoted all-lowercase name, the :class:`.quoted_name` +construct may be used:: + + from sqlalchemy import quoted_name + + # matches a table created with CREATE TABLE "mytable" + Table( + quoted_name("mytable", quote=True), metadata, autoload_with=some_engine + ) + +Name normalization also takes place when handling result sets from **purely +textual SQL strings**, that have no other :class:`.Table` or :class:`.Column` +metadata associated with them. This includes SQL strings executed using +:meth:`.Connection.exec_driver_sql` and SQL strings executed using the +:func:`.text` construct which do not include :class:`.Column` metadata. + +Returning to the Oracle Database SELECT statement, we see that even though +``cursor.description`` reports the column name as ``SOMENAME``, SQLAlchemy +name normalizes this to ``somename``:: + + >>> oracle_engine = create_engine("oracle+oracledb://scott:tiger@oracle18c/xe") + >>> oracle_connection = oracle_engine.connect() + >>> result = oracle_connection.exec_driver_sql( + ... "SELECT 1 AS SomeName FROM DUAL" + ... ) + >>> result.cursor.description + [('SOMENAME', , 127, None, 0, -127, True)] + >>> result.keys() + RMKeyView(['somename']) + +The single scenario where the above behavior produces inaccurate results +is when using an all-uppercase, quoted name. SQLAlchemy has no way to determine +that a particular name in ``cursor.description`` was quoted, and is therefore +case sensitive, or was not quoted, and should be name normalized:: + + >>> result = oracle_connection.exec_driver_sql( + ... 'SELECT 1 AS "SOMENAME" FROM DUAL' + ... ) + >>> result.cursor.description + [('SOMENAME', , 127, None, 0, -127, True)] + >>> result.keys() + RMKeyView(['somename']) + +For this exact scenario, SQLAlchemy offers the :paramref:`.Connection.execution_options.driver_column_names` +execution options, which turns off name normalize for result sets:: + + >>> result = oracle_connection.exec_driver_sql( + ... 'SELECT 1 AS "SOMENAME" FROM DUAL', + ... execution_options={"driver_column_names": True}, + ... ) + >>> result.keys() + RMKeyView(['SOMENAME']) + +.. versionadded:: 2.1 Added the :paramref:`.Connection.execution_options.driver_column_names` + execution option + .. _oracle_max_identifier_lengths: