flag is enabled for cx_Oracle assuming at least version 5.0, which
is now commonplace.
+ .. change:: 3907
+ :tags: feature, sql
+ :tickets: 3907
+
+ The longstanding behavior of the :meth:`.ColumnOperators.in_` and
+ :meth:`.ColumnOperators.notin_` operators emitting a warning when
+ the right-hand condition is an empty sequence has been revised;
+ a simple "static" expression of "1 != 1" or "1 = 1" is now rendered
+ by default, rather than pulling in the original left-hand
+ expression. This causes the result for a NULL column comparison
+ against an empty set to change from NULL to true/false. The
+ behavior is configurable, and the old behavior can be enabled
+ using the :paramref:`.create_engine.empty_in_strategy` parameter
+ to :func:`.create_engine`.
+
+ .. seealso::
+
+ :ref:`change_3907`
+
.. change:: 3276
:tags: bug, oracle
:tickets: 3276
Key Behavioral Changes - Core
=============================
+.. _change_3907:
+
+The IN / NOT IN operators render a simplified boolean expression with an empty collection
+-----------------------------------------------------------------------------------------
+
+An expression such as ``column.in_([])``, which is assumed to be false,
+now produces the expression ``1 != 1``
+by default, instead of ``column != column``. This will **change the result**
+of a query that is comparing a SQL expression or column that evaluates to
+NULL when compared to an empty set, producing a boolean value false or true
+(for NOT IN) rather than NULL. The warning that would emit under
+this condition is also removed. The old behavior is available using the
+:paramref:`.create_engine.empty_in_strategy` parameter to
+:func:`.create_engine`.
+
+In SQL, the IN and NOT IN operators do not support comparison to a
+collection of values that is explicitly empty; meaning, this syntax is
+illegal::
+
+ mycolumn IN ()
+
+To work around this, SQLAlchemy and other database libraries detect this
+condition and render an alternative expression that evaluates to false, or
+in the case of NOT IN, to true, based on the theory that "col IN ()" is always
+false since nothing is in "the empty set". Typically, in order to
+produce a false/true constant that is portable across databases and works
+in the context of the WHERE clause, a simple tautology such as ``1 != 1`` is
+used to evaluate to false and ``1 = 1`` to evaluate to true (a simple constant
+"0" or "1" often does not work as the target of a WHERE clause).
+
+SQLAlchemy in its early days began with this approach as well, but soon it
+was theorized that the SQL expression ``column IN ()`` would not evaluate to
+false if the "column" were NULL; instead, the expression would produce NULL,
+since "NULL" means "unknown", and comparisons to NULL in SQL usually produce
+NULL.
+
+To simulate this result, SQLAlchemy changed from using ``1 != 1`` to
+instead use th expression ``expr != expr`` for empty "IN" and ``expr = expr``
+for empty "NOT IN"; that is, instead of using a fixed value we use the
+actual left-hand side of the expression. If the left-hand side of
+the expression passed evaluates to NULL, then the comparison overall
+also gets the NULL result instead of false or true.
+
+Unfortunately, users eventually complained that this expression had a very
+severe performance impact on some query planners. At that point, a warning
+was added when an empty IN expression was encountered, favoring that SQLAlchemy
+continues to be "correct" and urging users to avoid code that generates empty
+IN predicates in general, since typically they can be safely omitted. However,
+this is of course burdensome in the case of queries that are built up dynamically
+from input variables, where an incoming set of values might be empty.
+
+In recent months, the original assumptions of this decision have been
+questioned. The notion that the expression "NULL IN ()" should return NULL was
+only theoretical, and could not be tested since databases don't support that
+syntax. However, as it turns out, you can in fact ask a relational database
+what value it would return for "NULL IN ()" by simulating the empty set as
+follows::
+
+ SELECT NULL IN (SELECT 1 WHERE 1 != 1)
+
+With the above test, we see that the databases themselves can't agree on
+the answer. Postgresql, considered by most to be the most "correct" database,
+returns False; because even though "NULL" represents "unknown", the "empty set"
+means nothing is present, including all unknown values. On the
+other hand, MySQL and MariaDB return NULL for the above expression, defaulting
+to the more common behavior of "all comparisons to NULL return NULL".
+
+SQLAlchemy's SQL architecture is more sophisticated than it was when this
+design decision was first made, so we can now allow either behavior to
+be invoked at SQL string compilation time. Previously, the conversion to a
+comparison expression were done at construction time, that is, the moment
+the :meth:`.ColumnOperators.in_` or :meth:`.ColumnOperators.notin_` operators were invoked.
+With the compilation-time behavior, the dialect itself can be instructed
+to invoke either approach, that is, the "static" ``1 != 1`` comparison or the
+"dynamic" ``expr != expr`` comparison. The default has been **changed**
+to be the "static" comparison, since this agrees with the behavior that
+Postgresql would have in any case and this is also what the vast majority
+of users prefer. This will **change the result** of a query that is comparing
+a null expression to the empty set, particularly one that is querying
+for the negation ``where(~null_expr.in_([]))``, since this now evaluates to true
+and not NULL.
+
+The behavior can now be controlled using the flag
+:paramref:`.create_engine.empty_in_strategy`, which defaults to the
+``"static"`` setting, but may also be set to ``"dynamic"`` or
+``"dynamic_warn"``, where the ``"dynamic_warn"`` setting is equivalent to the
+previous behavior of emitting ``expr != expr`` as well as a performance
+warning. However, it is anticipated that most users will appreciate the
+"static" default.
+
+:ticket:`3907`
+
Dialect Improvements and Changes - PostgreSQL
=============================================
"name normalized", that is, expressed as lower case for a case insensitive
name, rather than the raw UPPERCASE format that Oracle uses::
- >>> insp.get_indexes("addresses")
- [{'unique': False, 'column_names': [u'user_id'],
- 'name': u'address_idx', 'dialect_options': {}}]
+ >>> insp.get_indexes("addresses")
+ [{'unique': False, 'column_names': [u'user_id'],
+ 'name': u'address_idx', 'dialect_options': {}}]
- >>> insp.get_pk_constraint("addresses")
- {'name': u'pk_cons', 'constrained_columns': [u'id']}
+ >>> insp.get_pk_constraint("addresses")
+ {'name': u'pk_cons', 'constrained_columns': [u'id']}
- >>> insp.get_foreign_keys("addresses")
- [{'referred_table': u'users', 'referred_columns': [u'id'],
- 'referred_schema': None, 'name': u'user_id_fk',
- 'constrained_columns': [u'user_id']}]
+ >>> insp.get_foreign_keys("addresses")
+ [{'referred_table': u'users', 'referred_columns': [u'id'],
+ 'referred_schema': None, 'name': u'user_id_fk',
+ 'constrained_columns': [u'user_id']}]
Previously, the foreign keys result would look like::
- [{'referred_table': u'users', 'referred_columns': [u'id'],
- 'referred_schema': None, 'name': 'USER_ID_FK',
- 'constrained_columns': [u'user_id']}]
+ [{'referred_table': u'users', 'referred_columns': [u'id'],
+ 'referred_schema': None, 'name': 'USER_ID_FK',
+ 'constrained_columns': [u'user_id']}]
Where the above could create problems particularly with Alembic autogenerate.
WHERE mytable.x > my_fancy_formatting(5)
-Why does ``.col.in_([])`` Produce ``col != col``? Why not ``1=0``?
--------------------------------------------------------------------
-
-A little introduction to the issue. The IN operator in SQL, given a list of
-elements to compare against a column, generally does not accept an empty list,
-that is while it is valid to say::
-
- column IN (1, 2, 3)
-
-it's not valid to say::
-
- column IN ()
-
-SQLAlchemy's :meth:`.Operators.in_` operator, when given an empty list, produces this
-expression::
-
- column != column
-
-As of version 0.6, it also produces a warning stating that a less efficient
-comparison operation will be rendered. This expression is the only one that is
-both database agnostic and produces correct results.
-
-For example, the naive approach of "just evaluate to false, by comparing 1=0
-or 1!=1", does not handle nulls properly. An expression like::
-
- NOT column != column
-
-will not return a row when "column" is null, but an expression which does not
-take the column into account::
-
- NOT 1=0
-
-will.
-
-Closer to the mark is the following CASE expression::
-
- CASE WHEN column IS NOT NULL THEN 1=0 ELSE NULL END
-
-We don't use this expression due to its verbosity, and its also not
-typically accepted by Oracle within a WHERE clause - depending
-on how you phrase it, you'll either get "ORA-00905: missing keyword" or
-"ORA-00920: invalid relational operator". It's also still less efficient than
-just rendering SQL without the clause altogether (or not issuing the SQL at
-all, if the statement is just a simple search).
-
-The best approach therefore is to avoid the usage of IN given an argument list
-of zero length. Instead, don't emit the Query in the first place, if no rows
-should be returned. The warning is best promoted to a full error condition
-using the Python warnings filter (see http://docs.python.org/library/warnings.html).
-
:ref:`dbengine_logging` for information on how to configure logging
directly.
+ :param empty_in_strategy: The SQL compilation strategy to use when
+ rendering an IN or NOT IN expression for :meth:`.ColumnOperators.in_`
+ where the right-hand side
+ is an empty set. This is a string value that may be one of
+ ``static``, ``dynamic``, or ``dynamic_warn``. The ``static``
+ strategy is the default, and an IN comparison to an empty set
+ will generate a simple false expression "1 != 1". The ``dynamic``
+ strategy behaves like that of SQLAlchemy 1.1 and earlier, emitting
+ a false expression of the form "expr != expr", which has the effect
+ of evaluting to NULL in the case of a null expression.
+ ``dynamic_warn`` is the same as ``dynamic``, however also emits a
+ warning when an empty set is encountered; this because the "dynamic"
+ comparison is typically poorly performing on most databases.
+
+ .. versionadded:: 1.2 Added the ``empty_in_strategy`` setting and
+ additionally defaulted the behavior for empty-set IN comparisons
+ to a static boolean expression.
+
:param encoding: Defaults to ``utf-8``. This is the string
encoding used by SQLAlchemy for string encode/decode
operations which occur within SQLAlchemy, **outside of
supports_right_nested_joins=None,
case_sensitive=True,
supports_native_boolean=None,
+ empty_in_strategy='static',
label_length=None, **kwargs):
if not getattr(self, 'ported_sqla_06', True):
self.supports_native_boolean = supports_native_boolean
self.case_sensitive = case_sensitive
+ self.empty_in_strategy = empty_in_strategy
+ if empty_in_strategy == 'static':
+ self._use_static_in = True
+ elif empty_in_strategy in ('dynamic', 'dynamic_warn'):
+ self._use_static_in = False
+ self._warn_on_empty_in = empty_in_strategy == 'dynamic_warn'
+ else:
+ raise exc.ArgumentError(
+ "empty_in_strategy may be 'static', "
+ "'dynamic', or 'dynamic_warn'")
+
if label_length and label_length > self.max_identifier_length:
raise exc.ArgumentError(
"Label length of %d is greater than this dialect's"
return "NOT %s" % self.visit_binary(
binary, override_operator=operators.match_op)
+ def _emit_empty_in_warning(self):
+ util.warn(
+ 'The IN-predicate was invoked with an '
+ 'empty sequence. This results in a '
+ 'contradiction, which nonetheless can be '
+ 'expensive to evaluate. Consider alternative '
+ 'strategies for improved performance.')
+
+ def visit_empty_in_op_binary(self, binary, operator, **kw):
+ if self.dialect._use_static_in:
+ return "1 != 1"
+ else:
+ if self.dialect._warn_on_empty_in:
+ self._emit_empty_in_warning()
+ return self.process(binary.left != binary.left)
+
+ def visit_empty_notin_op_binary(self, binary, operator, **kw):
+ if self.dialect._use_static_in:
+ return "1 = 1"
+ else:
+ if self.dialect._warn_on_empty_in:
+ self._emit_empty_in_warning()
+ return self.process(binary.left == binary.left)
+
def visit_binary(self, binary, override_operator=None,
eager_grouping=False, **kw):
else:
o = expr._bind_param(op, o)
args.append(o)
- if len(args) == 0:
- # Special case handling for empty IN's, behave like
- # comparison against zero row selectable. We use != to
- # build the contradiction as it handles NULL values
- # appropriately, i.e. "not (x IN ())" should not return NULL
- # values for x.
-
- util.warn('The IN-predicate on "%s" was invoked with an '
- 'empty sequence. This results in a '
- 'contradiction, which nonetheless can be '
- 'expensive to evaluate. Consider alternative '
- 'strategies for improved performance.' % expr)
- if op is operators.in_op:
- return expr != expr
- else:
- return expr == expr
+ if len(args) == 0:
+ op, negate_op = (
+ operators.empty_in_op,
+ operators.empty_notin_op) if op is operators.in_op \
+ else (
+ operators.empty_notin_op,
+ operators.empty_in_op)
return _boolean_compare(expr, op,
ClauseList(*args).self_group(against=op),
"other" may be a tuple/list of column expressions,
or a :func:`~.expression.select` construct.
+ In the case that ``other`` is an empty sequence, the compiler
+ produces an "empty in" expression. This defaults to the
+ expression "1 != 1" to produce false in all cases. The
+ :paramref:`.create_engine.empty_in_strategy` may be used to
+ alter this behavior.
+
+ .. versionchanged:: 1.2 The :meth:`.ColumnOperators.in_` and
+ :meth:`.ColumnOperators.notin_` operators
+ now produce a "static" expression for an empty IN sequence
+ by default.
+
"""
return self.operate(in_op, other)
This is equivalent to using negation with
:meth:`.ColumnOperators.in_`, i.e. ``~x.in_(y)``.
- .. versionadded:: 0.8
+ In the case that ``other`` is an empty sequence, the compiler
+ produces an "empty not in" expression. This defaults to the
+ expression "1 = 1" to produce true in all cases. The
+ :paramref:`.create_engine.empty_in_strategy` may be used to
+ alter this behavior.
+
+ .. versionchanged:: 1.2 The :meth:`.ColumnOperators.in_` and
+ :meth:`.ColumnOperators.notin_` operators
+ now produce a "static" expression for an empty IN sequence
+ by default.
.. seealso::
raise NotImplementedError()
+def empty_in_op(a, b):
+ raise NotImplementedError()
+
+
+def empty_notin_op(a, b):
+ raise NotImplementedError()
+
+
def concat_op(a, b):
return a.concat(b)
ne: 5,
is_distinct_from: 5,
isnot_distinct_from: 5,
+ empty_in_op: 5,
+ empty_notin_op: 5,
gt: 5,
lt: 5,
ge: 5,
dialect=mysql.dialect()
)
- @testing.emits_warning('.*empty sequence.*')
def test_render_binds_as_literal(self):
"""test a compiler that renders binds inline into
SQL in the columns clause."""
self.assert_compile(
select([literal("foo").in_([])]),
- "SELECT 'foo' != 'foo' AS anon_1",
+ "SELECT 1 != 1 AS anon_1",
dialect=dialect
)
dialect=dialect
)
+ empty_in_dialect = default.DefaultDialect(empty_in_strategy='dynamic')
+ empty_in_dialect.statement_compiler = Compiler
+
assert_raises_message(
exc.CompileError,
"Bind parameter 'foo' without a "
"renderable value not allowed here.",
bindparam("foo").in_(
[]).compile,
- dialect=dialect)
+ dialect=empty_in_dialect)
def test_literal(self):
column('othername', String)
)
+ def _dialect(self, empty_in_strategy="static"):
+ return default.DefaultDialect(
+ empty_in_strategy=empty_in_strategy
+ )
+
def test_in_1(self):
self.assert_compile(self.table1.c.myid.in_(['a']),
"mytable.myid IN (:myid_1)")
"FROM myothertable)"
)
- @testing.emits_warning('.*empty sequence.*')
- def test_in_23(self):
- self.assert_compile(self.table1.c.myid.in_([]),
- "mytable.myid != mytable.myid")
-
def test_in_24(self):
self.assert_compile(
select([self.table1.c.myid.in_(select([self.table2.c.otherid]))]),
"mytable.myid IN (NULL)"
)
- @testing.emits_warning('.*empty sequence.*')
- def test_in_29(self):
+ def test_empty_in_dynamic_1(self):
+ self.assert_compile(self.table1.c.myid.in_([]),
+ "mytable.myid != mytable.myid",
+ dialect=self._dialect("dynamic"))
+
+ def test_empty_in_dynamic_2(self):
+ self.assert_compile(self.table1.c.myid.notin_([]),
+ "mytable.myid = mytable.myid",
+ dialect=self._dialect("dynamic"))
+
+ def test_empty_in_dynamic_3(self):
+ self.assert_compile(~self.table1.c.myid.in_([]),
+ "mytable.myid = mytable.myid",
+ dialect=self._dialect("dynamic"))
+
+ def test_empty_in_dynamic_warn_1(self):
+ with testing.expect_warnings(
+ "The IN-predicate was invoked with an empty sequence."):
+ self.assert_compile(self.table1.c.myid.in_([]),
+ "mytable.myid != mytable.myid",
+ dialect=self._dialect("dynamic_warn"))
+
+ def test_empty_in_dynamic_warn_2(self):
+ with testing.expect_warnings(
+ "The IN-predicate was invoked with an empty sequence."):
+ self.assert_compile(self.table1.c.myid.notin_([]),
+ "mytable.myid = mytable.myid",
+ dialect=self._dialect("dynamic_warn"))
+
+ def test_empty_in_dynamic_warn_3(self):
+ with testing.expect_warnings(
+ "The IN-predicate was invoked with an empty sequence."):
+ self.assert_compile(~self.table1.c.myid.in_([]),
+ "mytable.myid = mytable.myid",
+ dialect=self._dialect("dynamic_warn"))
+
+ def test_empty_in_static_1(self):
+ self.assert_compile(self.table1.c.myid.in_([]),
+ "1 != 1")
+
+ def test_empty_in_static_2(self):
self.assert_compile(self.table1.c.myid.notin_([]),
- "mytable.myid = mytable.myid")
+ "1 = 1")
- @testing.emits_warning('.*empty sequence.*')
- def test_in_30(self):
+ def test_empty_in_static_3(self):
self.assert_compile(~self.table1.c.myid.in_([]),
- "mytable.myid = mytable.myid")
+ "1 = 1")
class MathOperatorTest(fixtures.TestBase, testing.AssertsCompiledSQL):
s = users.select(not_(users.c.user_name.in_([])))
r = s.execute().fetchall()
- # All usernames with a value are outside an empty set
- assert len(r) == 2
+ assert len(r) == 3
s = users.select(users.c.user_name.in_(['jack', 'fred']))
r = s.execute().fetchall()
# Null values are not outside any set
assert len(r) == 0
- @testing.emits_warning('.*empty sequence.*')
@testing.fails_on('firebird', "uses sql-92 rules")
@testing.fails_on('sybase', "uses sql-92 rules")
@testing.fails_if(
r = s.execute(search_key='john').fetchall()
assert len(r) == 3
r = s.execute(search_key=None).fetchall()
- assert len(r) == 0
+ assert len(r) == 3
@testing.emits_warning('.*empty sequence.*')
def test_literal_in(self):
r = s.execute().fetchall()
assert len(r) == 3
- @testing.emits_warning('.*empty sequence.*')
@testing.requires.boolean_col_expressions
- def test_in_filtering_advanced(self):
+ def test_empty_in_filtering_static(self):
"""test the behavior of the in_() function when
comparing against an empty collection, specifically
that a proper boolean value is generated.
"""
- users.insert().execute(user_id=7, user_name='jack')
- users.insert().execute(user_id=8, user_name='fred')
- users.insert().execute(user_id=9, user_name=None)
+ with testing.db.connect() as conn:
+ conn.execute(
+ users.insert(),
+ [
+ {'user_id': 7, 'user_name': 'jack'},
+ {'user_id': 8, 'user_name': 'ed'},
+ {'user_id': 9, 'user_name': None}
+ ]
+ )
- s = users.select(users.c.user_name.in_([]) == True) # noqa
- r = s.execute().fetchall()
- assert len(r) == 0
- s = users.select(users.c.user_name.in_([]) == False) # noqa
- r = s.execute().fetchall()
- assert len(r) == 2
- s = users.select(users.c.user_name.in_([]) == None) # noqa
- r = s.execute().fetchall()
- assert len(r) == 1
+ s = users.select(users.c.user_name.in_([]) == True) # noqa
+ r = conn.execute(s).fetchall()
+ assert len(r) == 0
+ s = users.select(users.c.user_name.in_([]) == False) # noqa
+ r = conn.execute(s).fetchall()
+ assert len(r) == 3
+ s = users.select(users.c.user_name.in_([]) == None) # noqa
+ r = conn.execute(s).fetchall()
+ assert len(r) == 0
+
+ @testing.requires.boolean_col_expressions
+ def test_empty_in_filtering_dynamic(self):
+ """test the behavior of the in_() function when
+ comparing against an empty collection, specifically
+ that a proper boolean value is generated.
+
+ """
+
+ engine = engines.testing_engine(
+ options={"empty_in_strategy": "dynamic"})
+
+ with engine.connect() as conn:
+ users.create(engine, checkfirst=True)
+
+ conn.execute(
+ users.insert(),
+ [
+ {'user_id': 7, 'user_name': 'jack'},
+ {'user_id': 8, 'user_name': 'ed'},
+ {'user_id': 9, 'user_name': None}
+ ]
+ )
+
+ s = users.select(users.c.user_name.in_([]) == True) # noqa
+ r = conn.execute(s).fetchall()
+ assert len(r) == 0
+ s = users.select(users.c.user_name.in_([]) == False) # noqa
+ r = conn.execute(s).fetchall()
+ assert len(r) == 2
+ s = users.select(users.c.user_name.in_([]) == None) # noqa
+ r = conn.execute(s).fetchall()
+ assert len(r) == 1
class RequiredBindTest(fixtures.TablesTest):