From: Mike Bayer Date: Mon, 1 Nov 2021 16:06:32 +0000 (-0400) Subject: Revise "literal parameters" FAQ section X-Git-Tag: rel_2_0_0b1~684 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=b2a28c556f10ee31605c978173f0cce62175ad61;p=thirdparty%2Fsqlalchemy%2Fsqlalchemy.git Revise "literal parameters" FAQ section based on feedback in #7271, the emphasis on TypeDecorator as a solution to this problem is not very practical. illustrate a series of quick recipes that are useful for debugging purposes to print out a repr() or simple stringify of a parameter without the need to construct custom dialects or types. Change-Id: I788ce1b5ea01d88dd0a22d03d06f35aabff5e5c8 --- diff --git a/doc/build/faq/sqlexpressions.rst b/doc/build/faq/sqlexpressions.rst index 93653a10ce..cc629f4cc0 100644 --- a/doc/build/faq/sqlexpressions.rst +++ b/doc/build/faq/sqlexpressions.rst @@ -61,6 +61,13 @@ use a PostgreSQL dialect:: from sqlalchemy.dialects import postgresql print(statement.compile(dialect=postgresql.dialect())) +Note that any dialect can be assembled using :func:`_sa.create_engine` itself +with a dummy URL and then accessing the :attr:`_engine.Engine.dialect` attribute, +such as if we wanted a dialect object for psycopg2:: + + e = create_engine("postgresql+psycopg2://") + psycopg2_dialect = e.dialect + When given an ORM :class:`~.orm.query.Query` object, in order to get at the :meth:`_expression.ClauseElement.compile` method we only need access the :attr:`~.orm.query.Query.statement` @@ -72,7 +79,7 @@ accessor first:: Rendering Bound Parameters Inline ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -.. warning:: **Never** use this technique with string content received from +.. warning:: **Never** use these techniques with string content received from untrusted input, such as from web forms or other user-input applications. SQLAlchemy's facilities to coerce Python values into direct SQL string values are **not secure against untrusted input and do not validate the type @@ -98,44 +105,185 @@ flag, passed to ``compile_kwargs``:: # **do not use** with untrusted input!!! print(s.compile(compile_kwargs={"literal_binds": True})) -The above approach has the caveats that it is only supported for basic -types, such as ints and strings, and furthermore if a :func:`.bindparam` -without a pre-set value is used directly, it won't be able to -stringify that either. + # to render for a specific dialect + print(s.compile(dialect=dialect, compile_kwargs={"literal_binds": True})) -This functionality is provided mainly for -logging or debugging purposes, where having the raw sql string of a query -may prove useful. Note that the ``dialect`` parameter should also -passed to the :meth:`_expression.ClauseElement.compile` method to render -the query that will be sent to the database. + # or if you have an Engine, pass as first argument + print(s.compile(some_engine, compile_kwargs={"literal_binds": True})) -To support inline literal rendering for types not supported, implement -a :class:`.TypeDecorator` for the target type which includes a -:meth:`.TypeDecorator.process_literal_param` method:: +This functionality is provided mainly for logging or debugging purposes, where +having the raw sql string of a query may prove useful. - from sqlalchemy import TypeDecorator, Integer +The above approach has the caveats that it is only supported for basic types, +such as ints and strings, and furthermore if a :func:`.bindparam` without a +pre-set value is used directly, it won't be able to stringify that either. +Methods of stringifying all parameters unconditionally are detailed below. +.. tip:: - class MyFancyType(TypeDecorator): - impl = Integer + The reason SQLAlchemy does not support full stringification of all + datatypes is threefold: - def process_literal_param(self, value, dialect): - return "my_fancy_formatting(%s)" % value + 1. This is a functionality that is already supported by the DBAPI in use + when the DBAPI is used normally. The SQLAlchemy project cannot be + tasked with duplicating this functionality for every datatype for + all backends, as this is redundant work which also incurs significant + testing and ongoing support overhead. + + 2. Stringifying with bound parameters inlined for specific databases + suggests a usage that is actually passing these fully stringified + statements onto the database for execution. This is unnecessary and + insecure, and SQLAlchemy does not want to encourage this use in any + way. + + 3. The area of rendering literal values is the most likely area for + security issues to be reported. SQLAlchemy tries to keep the area of + safe parameter stringification an issue for the DBAPI drivers as much + as possible where the specifics for each DBAPI can be handled + appropriately and securely. + +As SQLAlchemy intentionally does not support full stringification of literal +values, techniques to do so within specific debugging scenarios include the +following. As an example, we will use the PostgreSQL :class:`_postgresql.UUID` +datatype:: + + import uuid + + from sqlalchemy import Column + from sqlalchemy import create_engine + from sqlalchemy import Integer + from sqlalchemy import select + from sqlalchemy.dialects.postgresql import UUID + from sqlalchemy.orm import declarative_base + + + Base = declarative_base() + + class A(Base): + __tablename__ = 'a' + + id = Column(Integer, primary_key=True) + data = Column(UUID) + + stmt = select(A).where(A.data == uuid.uuid4()) + +Given the above model and statement which will compare a column to a single +UUID value, options for stringifying this statement with inline values +include: + +* Some DBAPIs such as psycopg2 support helper functions like + `mogrify() `_ which + provide access to their literal-rendering functionality. To use such + features, render the SQL string without using ``literal_binds`` and pass + the parameters separately via the :attr:`.SQLCompiler.params` accessor:: + + e = create_engine("postgresql+psycopg2://scott:tiger@localhost/test") + + with e.connect() as conn: + cursor = conn.connection.cursor() + compiled = stmt.compile(e) + + print(cursor.mogrify(str(compiled), compiled.params)) + + The above code will produce psycopg2's raw bytestring:: + + b"SELECT a.id, a.data \nFROM a \nWHERE a.data = 'a511b0fc-76da-4c47-a4b4-716a8189b7ac'::uuid" + +* Render the :attr:`.SQLCompiler.params` directly into the statement, using + the appropriate `paramstyle `_ + of the target DBAPI. For example, the psycopg2 DBAPI uses the named ``pyformat`` + style. The meaning of ``render_postcompile`` will be discussed in the next + section. **WARNING this is NOT secure, do NOT use untrusted input**:: + + e = create_engine("postgresql+psycopg2://") + + # will use pyformat style, i.e. %(paramname)s for param + compiled = stmt.compile(e, compile_kwargs={"render_postcompile": True}) + + print(str(compiled) % compiled.params) + + This will produce a non-working string, that nonetheless is suitable for + debugging:: + + SELECT a.id, a.data + FROM a + WHERE a.data = 9eec1209-50b4-4253-b74b-f82461ed80c1 - from sqlalchemy import Table, Column, MetaData + Another example using a positional paramstyle such as ``qmark``, we can render + our above statement in terms of SQLite by also using the + :attr:`.SQLCompiler.positiontup` collection in conjunction with + :attr:`.SQLCompiler.params`, in order to retrieve the parameters in + their positional order for the statement as compiled:: - tab = Table('mytable', MetaData(), Column('x', MyFancyType())) + import re + e = create_engine("sqlite+pysqlite://") - stmt = tab.select().where(tab.c.x > 5) - print(stmt.compile(compile_kwargs={"literal_binds": True})) + # will use qmark style, i.e. ? for param + compiled = stmt.compile(e, compile_kwargs={"render_postcompile": True}) -producing output like:: + # params in positional order + params = (repr(compiled.params[name]) for name in compiled.positiontup) - SELECT mytable.x - FROM mytable - WHERE mytable.x > my_fancy_formatting(5) + print(re.sub(r'\?', lambda m: next(params), str(compiled))) + The above snippet prints:: + SELECT a.id, a.data + FROM a + WHERE a.data = UUID('1bd70375-db17-4d8c-94f1-fc2ef3aada26') + +* Use the :ref:`sqlalchemy.ext.compiler_toplevel` extension to render + :class:`_sql.BindParameter` objects in a custom way when a user-defined + flag is present. This flag is sent through the ``compile_kwargs`` + dictionary like any other flag:: + + from sqlalchemy.ext.compiler import compiles + from sqlalchemy.sql.expression import BindParameter + + @compiles(BindParameter) + def _render_literal_bindparam(element, compiler, use_my_literal_recipe=False, **kw): + if not use_my_literal_recipe: + # use normal bindparam processing + return compiler.visit_bindparam(element, **kw) + + # if use_my_literal_recipe was passed to compiler_kwargs, + # render the value directly + return repr(element.value) + + e = create_engine("postgresql+psycopg2://") + print(stmt.compile(e, compile_kwargs={"use_my_literal_recipe": True})) + + The above recipe will print:: + + SELECT a.id, a.data + FROM a + WHERE a.data = UUID('47b154cd-36b2-42ae-9718-888629ab9857') + +* For type-specific stringification that's built into a model or a statement, the + :class:`_types.TypeDecorator` class may be used to provide custom stringification + of any datatype using the :meth:`.TypeDecorator.process_literal_param` method:: + + from sqlalchemy import TypeDecorator + + class UUIDStringify(TypeDecorator): + impl = UUID + + def process_literal_param(self, value, dialect): + return repr(value) + + The above datatype needs to be used either explicitly within the model + or locally within the statement using :func:`_sql.type_coerce`, such as :: + + from sqlalchemy import type_coerce + stmt = select(A).where(type_coerce(A.data, UUIDStringify) == uuid.uuid4()) + + print(stmt.compile(e, compile_kwargs={"literal_binds": True})) + + Again printing the same form:: + + SELECT a.id, a.data + FROM a + WHERE a.data = UUID('47b154cd-36b2-42ae-9718-888629ab9857') Rendering "POSTCOMPILE" Parameters as Bound Parameters ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ @@ -149,23 +297,57 @@ known values are passed. "Expanding" parameters are used for string can be safely cached independently of the actual lists of values being passed to a particular invocation of :meth:`_sql.ColumnOperators.in_`:: - >>> from sqlalchemy import column - >>> expr = column('x').in_([1, 2, 3]) - >>> print(expr) - x IN ([POSTCOMPILE_x_1]) + >>> stmt = select(A).where(A.id.in_[1, 2, 3]) To render the IN clause with real bound parameter symbols, use the ``render_postcompile=True`` flag with :meth:`_sql.ClauseElement.compile`:: - >>> print(expr.compile(compile_kwargs={"render_postcompile": True})) - x IN (:x_1_1, :x_1_2, :x_1_3) + >>> e = create_engine("postgresql+psycopg2://") + >>> print(stmt.compile(e, compile_kwargs={"render_postcompile": True})) + SELECT a.id, a.data + FROM a + WHERE a.id IN (%(id_1_1)s, %(id_1_2)s, %(id_1_3)s) + +The ``literal_binds`` flag, described in the previous section regarding +rendering of bound parameters, automatically sets ``render_postcompile`` to +True, so for a statement with simple ints/strings, these can be stringified +directly:: + + # render_postcompile is implied by literal_binds + >>> print(stmt.compile(e, compile_kwargs={"literal_binds": True})) + SELECT a.id, a.data + FROM a + WHERE a.id IN (1, 2, 3) + +The :attr:`.SQLCompiler.params` and :attr:`.SQLCompiler.positiontup` are +also compatible with ``render_postcompile``, so that +the previous recipes for rendering inline bound parameters will work here +in the same way, such as SQLite's positional form:: + + >>> u1, u2, u3 = uuid.uuid4(), uuid.uuid4(), uuid.uuid4() + >>> stmt = select(A).where(A.data.in_([u1, u2, u3])) + + >>> import re + >>> e = create_engine("sqlite+pysqlite://") + >>> compiled = stmt.compile(e, compile_kwargs={"render_postcompile": True}) + >>> params = (repr(compiled.params[name]) for name in compiled.positiontup) + >>> print(re.sub(r'\?', lambda m: next(params), str(compiled))) + SELECT a.id, a.data + FROM a + WHERE a.data IN (UUID('aa1944d6-9a5a-45d5-b8da-0ba1ef0a4f38'), UUID('a81920e6-15e2-4392-8a3c-d775ffa9ccd2'), UUID('b5574cdb-ff9b-49a3-be52-dbc89f087bfa')) + +.. warning:: + + Remember, all of the above code recipes are **only to be used when**: + + 1. the use is **debugging purposes only** -As described in the previous section, the ``literal_binds`` flag works here -by automatically setting ``render_postcompile`` to True:: + 2. the string **is not to be passed to a live production database** - >>> print(expr.compile(compile_kwargs={"literal_binds": True})) - x IN (1, 2, 3) + 3. only with **local, trusted input** + The above recipes for stringification of parameters are **not secure in + any way and should never be used against production databases**. .. _faq_sql_expression_percent_signs: diff --git a/lib/sqlalchemy/sql/compiler.py b/lib/sqlalchemy/sql/compiler.py index 5c3fbb2b1f..7a2333d91d 100644 --- a/lib/sqlalchemy/sql/compiler.py +++ b/lib/sqlalchemy/sql/compiler.py @@ -670,6 +670,21 @@ class SQLCompiler(Compiled): """ + positiontup = None + """for a compiled construct that uses a positional paramstyle, will be + a sequence of strings, indicating the names of bound parameters in order. + + This is used in order to render bound parameters in their correct order, + and is combined with the :attr:`_sql.Compiled.params` dictionary to + render parameters. + + .. seealso:: + + :ref:`faq_sql_expression_string` - includes a usage example for + debugging use cases. + + """ + inline = False def __init__( @@ -1091,7 +1106,14 @@ class SQLCompiler(Compiled): @property def params(self): """Return the bind param dictionary embedded into this - compiled object, for those values that are present.""" + compiled object, for those values that are present. + + .. seealso:: + + :ref:`faq_sql_expression_string` - includes a usage example for + debugging use cases. + + """ return self.construct_params(_check=False) def _process_parameters_for_postcompile(