From: Mike Bayer Date: Wed, 6 May 2020 13:47:26 +0000 (-0400) Subject: Clarify create_engine encoding; update cx_Oracle X-Git-Tag: rel_1_3_17~10 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=2c4a775022d9e239e432610a580f78528cb1e27b;p=thirdparty%2Fsqlalchemy%2Fsqlalchemy.git Clarify create_engine encoding; update cx_Oracle The create_engine()->encoding parameter is mostly irrelevant under Python 3. make it clear this parameter is not generally useful anymore and refer readers to the dialect documenation. Update cx_Oracle documentation to feature many examples of the encoding / nencoding parameters, remove extra detail that is not generally useful and reorganize information into more specific sections, de-emphasizing legacy / Python 2 specific sections. Change-Id: I42dafb85de0a585c9e05e1e3d787d8d6bfa632c0 (cherry picked from commit 3ab5e3781b9f44d458cb6213e89f50d01ea167dd) --- diff --git a/lib/sqlalchemy/dialects/oracle/base.py b/lib/sqlalchemy/dialects/oracle/base.py index 5963a2923d..05d01b27be 100644 --- a/lib/sqlalchemy/dialects/oracle/base.py +++ b/lib/sqlalchemy/dialects/oracle/base.py @@ -12,21 +12,6 @@ r""" Oracle version 8 through current (11g at the time of this writing) are supported. -Connect Arguments ------------------ - -The dialect supports several :func:`~sqlalchemy.create_engine()` arguments -which affect the behavior of the dialect regardless of driver in use. - -* ``use_ansi`` - Use ANSI JOIN constructs (see the section on Oracle 8). - Defaults to ``True``. If ``False``, Oracle-8 compatible constructs are used - for joins. - -* ``optimize_limits`` - defaults to ``False``. see the section on - LIMIT/OFFSET. - -* ``use_binds_for_limits`` - defaults to ``True``. see the section on - LIMIT/OFFSET. Auto Increment Behavior ----------------------- diff --git a/lib/sqlalchemy/dialects/oracle/cx_oracle.py b/lib/sqlalchemy/dialects/oracle/cx_oracle.py index 2fa9a30190..586fe7c97f 100644 --- a/lib/sqlalchemy/dialects/oracle/cx_oracle.py +++ b/lib/sqlalchemy/dialects/oracle/cx_oracle.py @@ -11,16 +11,23 @@ r""" :connectstring: oracle+cx_oracle://user:pass@host:port/dbname[?key=value&key=value...] :url: https://oracle.github.io/python-cx_Oracle/ -Additional Connect Arguments ----------------------------- +DSN vs. Hostname connections +----------------------------- -When connecting with the ``dbname`` URL token present, the ``hostname``, -``port``, and ``dbname`` tokens are converted to a TNS name using the -``cx_Oracle.makedsn()`` function. The URL below:: +The dialect will connect to a DSN if no database name portion is presented, +such as:: + + engine = create_engine("oracle+cx_oracle://scott:tiger@oracle1120/?encoding=UTF-8&nencoding=UTF-8") + +Above, ``oracle1120`` is passed to cx_Oracle as an Oracle datasource name. - e = create_engine("oracle+cx_oracle://user:pass@hostname/dbname") +Alternatively, if a database name is present, the ``cx_Oracle.makedsn()`` +function is used to create an ad-hoc "datasource" name assuming host +and port:: -Will be used to create the DSN as follows:: + engine = create_engine("oracle+cx_oracle://scott:tiger@hostname:1521/dbname?encoding=UTF-8&nencoding=UTF-8") + +Above, the DSN would be created as follows:: >>> import cx_Oracle >>> cx_Oracle.makedsn("hostname", 1521, sid="dbname") @@ -29,39 +36,44 @@ Will be used to create the DSN as follows:: The ``service_name`` parameter, also consumed by ``cx_Oracle.makedsn()``, may be specified in the URL query string, e.g. ``?service_name=my_service``. -If ``dbname`` is not present, then the value of ``hostname`` in the -URL is used directly as the DSN passed to ``cx_Oracle.connect()``. -Additional connection arguments may be sent to the ``cx_Oracle.connect()`` -function using the :paramref:`_sa.create_engine.connect_args` dictionary. +Passing cx_Oracle connect arguments +----------------------------------- + +Additional connection arguments can usually be passed via the URL +query string; particular symbols like ``cx_Oracle.SYSDBA`` are intercepted +and converted to the correct symbol:: + + e = create_engine( + "oracle+cx_oracle://user:pass@dsn?encoding=UTF-8&nencoding=UTF-8&mode=SYSDBA&events=true") + +.. versionchanged:: 1.3 the cx_oracle dialect now accepts all argument names + within the URL string itself, to be passed to the cx_Oracle DBAPI. As + was the case earlier but not correctly documented, the + :paramref:`_sa.create_engine.connect_args` parameter also accepts all + cx_Oracle DBAPI connect arguments. + +To pass arguments directly to ``.connect()`` wihtout using the query +string, use the :paramref:`_sa.create_engine.connect_args` dictionary. Any cx_Oracle parameter value and/or constant may be passed, such as:: import cx_Oracle e = create_engine( "oracle+cx_oracle://user:pass@dsn", connect_args={ + "encoding": "UTF-8", + "nencoding": "UTF-8", "mode": cx_Oracle.SYSDBA, "events": True } ) -Alternatively, most cx_Oracle DBAPI arguments can also be encoded as strings -within the URL, which includes parameters such as ``mode``, ``purity``, -``events``, ``threaded``, and others:: - - e = create_engine( - "oracle+cx_oracle://user:pass@dsn?mode=SYSDBA&events=true") - -.. versionchanged:: 1.3 the cx_oracle dialect now accepts all argument names - within the URL string itself, to be passed to the cx_Oracle DBAPI. As - was the case earlier but not correctly documented, the - :paramref:`_sa.create_engine.connect_args` parameter also accepts all - cx_Oracle DBAPI connect arguments. +Options consumed by the SQLAlchemy cx_Oracle dialect outside of the driver +-------------------------------------------------------------------------- There are also options that are consumed by the SQLAlchemy cx_oracle dialect itself. These options are always passed directly to :func:`_sa.create_engine` -, -such as:: +, such as:: e = create_engine( "oracle+cx_oracle://user:pass@dsn", coerce_to_unicode=False) @@ -86,19 +98,50 @@ The parameters accepted by the cx_oracle dialect are as follows: Unicode ------- -The cx_Oracle DBAPI as of version 5 fully supports Unicode, and has the -ability to return string results as Python Unicode objects natively. +As is the case for all DBAPIs under Python 3, all strings are inherently +Unicode strings. Under Python 2, cx_Oracle also supports Python Unicode +objects directly. In all cases however, the driver requires an explcit +encoding configuration. + +Ensuring the Correct Client Encoding +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The long accepted standard for establishing client encoding for nearly all +Oracle related software is via the `NLS_LANG `_ +environment variable. cx_Oracle like most other Oracle drivers will use +this environment variable as the source of its encoding configuration. The +format of this variable is idiosyncratic; a typical value would be +``AMERICAN_AMERICA.AL32UTF8``. + +The cx_Oracle driver also supports a programmatic alternative which is to +pass the ``encoding`` and ``nencoding`` parameters directly to its +``.connect()`` function. These can be present in the URL as follows:: -Explicit Unicode support is available by using the :class:`.Unicode` datatype -with SQLAlchemy Core expression language, as well as the :class:`.UnicodeText` -datatype. These types correspond to the VARCHAR2 and CLOB Oracle datatypes by + engine = create_engine("oracle+cx_oracle://scott:tiger@oracle1120/?encoding=UTF-8&nencoding=UTF-8") + +For the meaning of the ``encoding`` and ``nencoding`` parameters, please +consult +`Characters Sets and National Language Support (NLS) `_. + +.. seealso:: + + `Characters Sets and National Language Support (NLS) `_ + - in the cx_Oracle documentation. + + +Unicode-specific Column datatypes +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + +The Core expression language handles unicode data by use of the :class:`.Unicode` +and :class:`.UnicodeText` +datatypes. These types correspond to the VARCHAR2 and CLOB Oracle datatypes by default. When using these datatypes with Unicode data, it is expected that the Oracle database is configured with a Unicode-aware character set, as well as that the ``NLS_LANG`` environment variable is set appropriately, so that the VARCHAR2 and CLOB datatypes can accommodate the data. In the case that the Oracle database is not configured with a Unicode character -set, the two options are to use the :class:`_oracle.NCHAR` and +set, the two options are to use the :class:`_types.NCHAR` and :class:`_oracle.NCLOB` datatypes explicitly, or to pass the flag ``use_nchar_for_unicode=True`` to :func:`_sa.create_engine`, which will cause the @@ -110,6 +153,9 @@ SQLAlchemy dialect to use NCHAR/NCLOB for the :class:`.Unicode` / unless the ``use_nchar_for_unicode=True`` is passed to the dialect when :func:`_sa.create_engine` is called. +Unicode Coercion of result rows under Python 2 +^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ + When result sets are fetched that include strings, under Python 3 the cx_Oracle DBAPI returns all strings as Python Unicode objects, since Python 3 only has a Unicode string type. This occurs for data fetched from datatypes such as @@ -127,7 +173,6 @@ VARCHAR2, CHAR, and CLOB, the flag ``coerce_to_unicode=False`` can be passed to and can be set to False to disable the Unicode coercion of strings that are delivered as VARCHAR2/CHAR/CLOB data. - .. _cx_oracle_unicode_encoding_errors: Encoding Errors diff --git a/lib/sqlalchemy/engine/__init__.py b/lib/sqlalchemy/engine/__init__.py index efa333a4a5..594b2251b2 100644 --- a/lib/sqlalchemy/engine/__init__.py +++ b/lib/sqlalchemy/engine/__init__.py @@ -210,25 +210,30 @@ def create_engine(*args, **kwargs): :param encoding: Defaults to ``utf-8``. This is the string encoding used by SQLAlchemy for string encode/decode operations which occur within SQLAlchemy, **outside of - the DBAPI.** Most modern DBAPIs feature some degree of - direct support for Python ``unicode`` objects, - what you see in Python 2 as a string of the form - ``u'some string'``. For those scenarios where the - DBAPI is detected as not supporting a Python ``unicode`` - object, this encoding is used to determine the - source/destination encoding. It is **not used** - for those cases where the DBAPI handles unicode - directly. - - To properly configure a system to accommodate Python - ``unicode`` objects, the DBAPI should be - configured to handle unicode to the greatest - degree as is appropriate - see - the notes on unicode pertaining to the specific - target database in use at :ref:`dialect_toplevel`. + the DBAPIs own encoding facilities.** + + .. note:: The ``encoding`` parameter deals only with in-Python + encoding issues that were prevalent with many DBAPIs under Python + 2. Under Python 3 it is mostly unused. For DBAPIs that require + client encoding configurations, such as those of MySQL and Oracle, + please consult specific :ref:`dialect documentation + ` for details. + + All modern DBAPIs that work in Python 3 necessarily feature direct + support for Python unicode strings. Under Python 2, this was not + always the case. For those scenarios where the DBAPI is detected as + not supporting a Python ``unicode`` object under Python 2, this + encoding is used to determine the source/destination encoding. It is + **not used** for those cases where the DBAPI handles unicode directly. + + To properly configure a system to accommodate Python ``unicode`` + objects, the DBAPI should be configured to handle unicode to the + greatest degree as is appropriate - see the notes on unicode pertaining + to the specific target database in use at :ref:`dialect_toplevel`. Areas where string encoding may need to be accommodated - outside of the DBAPI include zero or more of: + outside of the DBAPI, nearly always under **Python 2 only**, + include zero or more of: * the values passed to bound parameters, corresponding to the :class:`.Unicode` type or the :class:`.String` type @@ -244,13 +249,12 @@ def create_engine(*args, **kwargs): * the string column names retrieved from the DBAPI's ``cursor.description`` attribute. - When using Python 3, the DBAPI is required to support - *all* of the above values as Python ``unicode`` objects, - which in Python 3 are just known as ``str``. In Python 2, - the DBAPI does not specify unicode behavior at all, - so SQLAlchemy must make decisions for each of the above - values on a per-DBAPI basis - implementations are - completely inconsistent in their behavior. + When using Python 3, the DBAPI is required to support all of the above + values as Python ``unicode`` objects, which in Python 3 are just known + as ``str``. In Python 2, the DBAPI does not specify unicode behavior + at all, so SQLAlchemy must make decisions for each of the above values + on a per-DBAPI basis - implementations are completely inconsistent in + their behavior. :param execution_options: Dictionary execution options which will be applied to all connections. See