- reorganize

author Mike Bayer <mike_mp@zzzcomputing.com>

Tue, 2 Sep 2014 00:31:00 +0000 (20:31 -0400)

committer Mike Bayer <mike_mp@zzzcomputing.com>

Tue, 2 Sep 2014 00:31:00 +0000 (20:31 -0400)
diff --git a/doc/build/changelog/migration_10.rst b/doc/build/changelog/migration_10.rst

index 8f01e99e6e2ec0c59a90ad2315f0a7871b5f6f78..8acaa04458a93b098c29d5f581e12b1ed215e5ca 100644
--- a/doc/build/changelog/migration_10.rst
+++ b/doc/build/changelog/migration_10.rst
@@ -8,7 +8,7 @@ What's New in SQLAlchemy 1.0?
      undergoing maintenance releases as of May, 2014,
      and SQLAlchemy version 1.0, as of yet unreleased.
  
-    Document last updated: August 26, 2014
+    Document last updated: September 1, 2014
  
  Introduction
  ============
@@ -22,236 +22,372 @@ Please carefully review
  potentially backwards-incompatible changes.
  
  
-.. _behavioral_changes_orm_10:
+New Features
+============
  
-Behavioral Changes - ORM
-========================
+.. _feature_3034:
  
-.. _migration_3061:
+Select/Query LIMIT / OFFSET may be specified as an arbitrary SQL expression
+----------------------------------------------------------------------------
  
-Changes to attribute events and other operations regarding attributes that have no pre-existing value
-------------------------------------------------------------------------------------------------------
+The :meth:`.Select.limit` and :meth:`.Select.offset` methods now accept
+any SQL expression, in addition to integer values, as arguments.  The ORM
+:class:`.Query` object also passes through any expression to the underlying
+:class:`.Select` object.   Typically
+this is used to allow a bound parameter to be passed, which can be substituted
+with a value later::
  
-In this change, the default return value of ``None`` when accessing an object
-is now returned dynamically on each access, rather than implicitly setting the
-attribute's state with a special "set" operation when it is first accessed.
-The visible result of this change is that ``obj.__dict__`` is not implicitly
-modified on get, and there are also some minor behavioral changes
-for :func:`.attributes.get_history` and related functions.
+       sel = select([table]).limit(bindparam('mylimit')).offset(bindparam('myoffset'))
  
-Given an object with no state::
+Dialects which don't support non-integer LIMIT or OFFSET expressions may continue
+to not support this behavior; third party dialects may also need modification
+in order to take advantage of the new behavior.  A dialect which currently
+uses the ``._limit`` or ``._offset`` attributes will continue to function
+for those cases where the limit/offset was specified as a simple integer value.
+However, when a SQL expression is specified, these two attributes will
+instead raise a :class:`.CompileError` on access.  A third-party dialect which
+wishes to support the new feature should now call upon the ``._limit_clause``
+and ``._offset_clause`` attributes to receive the full SQL expression, rather
+than the integer value.
  
-       >>> obj = Foo()
  
-It has always been SQLAlchemy's behavior such that if we access a scalar
-or many-to-one attribute that was never set, it is returned as ``None``::
+Behavioral Improvements
+=======================
  
-       >>> obj.someattr
-       None
+.. _feature_updatemany:
  
-This value of ``None`` is in fact now part of the state of ``obj``, and is
-not unlike as though we had set the attribute explicitly, e.g.
-``obj.someattr = None``.  However, the "set on get" here would behave
-differently as far as history and events.   It would not emit any attribute
-event, and additionally if we view history, we see this::
+UPDATE statements are now batched with executemany() in a flush
+----------------------------------------------------------------
  
-       >>> inspect(obj).attrs.someattr.history
-       History(added=(), unchanged=[None], deleted=())   # 0.9 and below
+UPDATE statements can now be batched within an ORM flush
+into a more performant executemany() call, similarly to how INSERT
+statements can be batched; this will be invoked within flush
+based on the following criteria:
  
-That is, it's as though the attribute were always ``None`` and were
-never changed.  This is explicitly different from if we had set the
-attribute first instead::
+* two or more UPDATE statements in sequence involve the identical set of
+  columns to be modified.
  
-       >>> obj = Foo()
-       >>> obj.someattr = None
-       >>> inspect(obj).attrs.someattr.history
-       History(added=[None], unchanged=(), deleted=())  # all versions
+* The statement has no embedded SQL expressions in the SET clause.
  
-The above means that the behavior of our "set" operation can be corrupted
-by the fact that the value was accessed via "get" earlier.  In 1.0, this
-inconsistency has been resolved, by no longer actually setting anything
-when the default "getter" is used.
+* The mapping does not use a :paramref:`~.orm.mapper.version_id_col`, or
+  the backend dialect supports a "sane" rowcount for an executemany()
+  operation; most DBAPIs support this correctly now.
  
-       >>> obj = Foo()
-       >>> obj.someattr
-       None
-       >>> inspect(obj).attrs.someattr.history
-       History(added=(), unchanged=(), deleted=())  # 1.0
-       >>> obj.someattr = None
-       >>> inspect(obj).attrs.someattr.history
-       History(added=[None], unchanged=(), deleted=())
+ORM full object fetches 25% faster
+----------------------------------
  
-The reason the above behavior hasn't had much impact is because the
-INSERT statement in relational databases considers a missing value to be
-the same as NULL in most cases.   Whether SQLAlchemy received a history
-event for a particular attribute set to None or not would usually not matter;
-as the difference between sending None/NULL or not wouldn't have an impact.
-However, as :ticket:`3060` illustrates, there are some seldom edge cases
-where we do in fact want to positively have ``None`` set.  Also, allowing
-the attribute event here means it's now possible to create "default value"
-functions for ORM mapped attributes.
+The mechanics of the ``loading.py`` module as well as the identity map
+have undergone several passes of inlining, refactoring, and pruning, so
+that a raw load of rows now populates ORM-based objects around 25% faster.
+Assuming a 1M row table, a script like the following illustrates the type
+of load that's improved the most::
  
-As part of this change, the generation of the implicit "None" is now disabled
-for other situations where this used to occur; this includes when an
-attribute set operation on a many-to-one is received; previously, the "old" value
-would be "None" if it had been not set otherwise; it now will send the
-value :data:`.orm.attributes.NEVER_SET`, which is a value that may be sent
-to an attribute listener now.   This symbol may also be received when
-calling on mapper utility functions such as :meth:`.Mapper.primary_key_from_instance`;
-if the primary key attributes have no setting at all, whereas the value
-would be ``None`` before, it will now be the :data:`.orm.attributes.NEVER_SET`
-symbol, and no change to the object's state occurs.
+       import time
+       from sqlalchemy import Integer, Column, create_engine, Table
+       from sqlalchemy.orm import Session
+       from sqlalchemy.ext.declarative import declarative_base
  
-:ticket:`3061`
+       Base = declarative_base()
  
-.. _migration_2992:
+       class Foo(Base):
+           __table__ = Table(
+               'foo', Base.metadata,
+               Column('id', Integer, primary_key=True),
+               Column('a', Integer(), nullable=False),
+               Column('b', Integer(), nullable=False),
+               Column('c', Integer(), nullable=False),
+           )
  
-Warnings emitted when coercing full SQL fragments into text()
--------------------------------------------------------------
+       engine = create_engine(
+               'mysql+mysqldb://scott:tiger@localhost/test', echo=True)
  
-Since SQLAlchemy's inception, there has always been an emphasis on not getting
-in the way of the usage of plain text.   The Core and ORM expression systems
-were intended to allow any number of points at which the user can just
-use plain text SQL expressions, not just in the sense that you can send a
-full SQL string to :meth:`.Connection.execute`, but that you can send strings
-with SQL expressions into many functions, such as :meth:`.Select.where`,
-:meth:`.Query.filter`, and :meth:`.Select.order_by`.
+       sess = Session(engine)
  
-Note that by "SQL expressions" we mean a **full fragment of a SQL string**,
-such as::
+       now = time.time()
  
-       # the argument sent to where() is a full SQL expression
-       stmt = select([sometable]).where("somecolumn = 'value'")
+       # avoid using all() so that we don't have the overhead of building
+       # a large list of full objects in memory
+       for obj in sess.query(Foo).yield_per(100).limit(1000000):
+           pass
  
-and we are **not talking about string arguments**, that is, the normal
-behavior of passing string values that become parameterized::
+       print("Total time: %d" % (time.time() - now))
  
-       # This is a normal Core expression with a string argument -
-       # we aren't talking about this!!
-       stmt = select([sometable]).where(sometable.c.somecolumn == 'value')
+On a local MacBook Pro, the script above benches at 19 seconds for 0.9, down to
+14 seconds for 1.0.  The :meth:`.Query.yield_per` call is always a good idea when batching
+huge numbers of rows, as it prevents the Python interpreter from having
+to allocate a huge amount of memory for all objects and their instrumentation
+at once.  Without the :meth:`.Query.yield_per`, the above script on the
+MacBook Pro takes 31 seconds on 0.9 and 26 seconds on 1.0, with the extra time
+spent setting up very large memory buffers.
  
-The Core tutorial has long featured an example of the use of this technique,
-using a :func:`.select` construct where virtually all components of it
-are specified as straight strings.  However, despite this long-standing
-behavior and example, users are apparently surprised that this behavior
-exists, and when asking around the community, I was unable to find any user
-that was in fact *not* surprised that you can send a full string into a method
-like :meth:`.Query.filter`.
  
-So the change here is to encourage the user to qualify textual strings when
-composing SQL that is partially or fully composed from textual fragments.
-When composing a select as below::
  
-       stmt = select(["a", "b"]).where("a = b").select_from("sometable")
+.. _feature_3176:
  
-The statement is built up normally, with all the same coercions as before.
-However, one will see the following warnings emitted::
+New KeyedTuple implementation dramatically faster
+-------------------------------------------------
  
-       SAWarning: Textual column expression 'a' should be explicitly declared
-       with text('a'), or use column('a') for more specificity
-       (this warning may be suppressed after 10 occurrences)
+We took a look into the :class:`.KeyedTuple` implementation in the hopes
+of improving queries like this::
  
-       SAWarning: Textual column expression 'b' should be explicitly declared
-       with text('b'), or use column('b') for more specificity
-       (this warning may be suppressed after 10 occurrences)
+       rows = sess.query(Foo.a, Foo.b, Foo.c).all()
  
-       SAWarning: Textual SQL expression 'a = b' should be explicitly declared
-       as text('a = b') (this warning may be suppressed after 10 occurrences)
+The :class:`.KeyedTuple` class is used rather than Python's
+``collections.namedtuple()``, because the latter has a very complex
+type-creation routine that benchmarks much slower than :class:`.KeyedTuple`.
+However, when fetching hundreds of thousands of rows,
+``collections.namedtuple()`` quickly overtakes :class:`.KeyedTuple` which
+becomes dramatically slower as instance invocation goes up.   What to do?
+A new type that hedges between the approaches of both.   Benching
+all three types for "size" (number of rows returned) and "num"
+(number of distinct queries), the new "lightweight keyed tuple" either
+outperforms both, or lags very slightly behind the faster object, depending
+on the scenario.  In the "sweet spot", where we are both creating a good number
+of new types as well as fetching a good number of rows, the lightweight
+object totally smokes both namedtuple and KeyedTuple::
  
-       SAWarning: Textual SQL FROM expression 'sometable' should be explicitly
-       declared as text('sometable'), or use table('sometable') for more
-       specificity (this warning may be suppressed after 10 occurrences)
+       -----------------
+       size=10 num=10000                 # few rows, lots of queries
+       namedtuple: 3.60302400589         # namedtuple falls over
+       keyedtuple: 0.255059957504        # KeyedTuple very fast
+       lw keyed tuple: 0.582715034485    # lw keyed trails right on KeyedTuple
+       -----------------
+       size=100 num=1000                 # <--- sweet spot
+       namedtuple: 0.365247011185
+       keyedtuple: 0.24896979332
+       lw keyed tuple: 0.0889317989349   # lw keyed blows both away!
+       -----------------
+       size=10000 num=100
+       namedtuple: 0.572599887848
+       keyedtuple: 2.54251694679
+       lw keyed tuple: 0.613876104355
+       -----------------
+       size=1000000 num=10               # few queries, lots of rows
+       namedtuple: 5.79669594765         # namedtuple very fast
+       keyedtuple: 28.856498003          # KeyedTuple falls over
+       lw keyed tuple: 6.74346804619     # lw keyed trails right on namedtuple
  
-These warnings attempt to show exactly where the issue is by displaying
-the parameters as well as where the string was received.
-The warnings make use of the :ref:`feature_3178` so that parameterized warnings
-can be emitted safely without running out of memory, and as always, if
-one wishes the warnings to be exceptions, the
-`Python Warnings Filter <https://docs.python.org/2/library/warnings.html>`_
-should be used::
  
+:ticket:`3176`
+
+.. _feature_3178:
+
+New systems to safely emit parameterized warnings
+-------------------------------------------------
+
+For a long time, there has been a restriction that warning messages could not
+refer to data elements, such that a particular function might emit an
+infinite number of unique warnings.  The key place this occurs is in the
+``Unicode type received non-unicode bind param value`` warning.  Placing
+the data value in this message would mean that the Python ``__warningregistry__``
+for that module, or in some cases the Python-global ``warnings.onceregistry``,
+would grow unbounded, as in most warning scenarios, one of these two collections
+is populated with every distinct warning message.
+
+The change here is that by using a special ``string`` type that purposely
+changes how the string is hashed, we can control that a large number of
+parameterized messages are hashed only on a small set of possible hash
+values, such that a warning such as ``Unicode type received non-unicode
+bind param value`` can be tailored to be emitted only a specific number
+of times; beyond that, the Python warnings registry will begin recording
+them as duplicates.
+
+To illustrate, the following test script will show only ten warnings being
+emitted for ten of the parameter sets, out of a total of 1000::
+
+       from sqlalchemy import create_engine, Unicode, select, cast
+       import random
         import warnings
-       warnings.simplefilter("error")   # all warnings raise an exception
  
-Given the above warnings, our statement works just fine, but
-to get rid of the warnings we would rewrite our statement as follows::
+       e = create_engine("sqlite://")
  
-       from sqlalchemy import select, text
-       stmt = select([
-            text("a"),
-            text("b")
-        ]).where(text("a = b")).select_from(text("sometable"))
+       # Use the "once" filter (which is also the default for Python
+       # warnings).  Exactly ten of these warnings will
+       # be emitted; beyond that, the Python warnings registry will accumulate
+       # new values as dupes of one of the ten existing.
+       warnings.filterwarnings("once")
  
-and as the warnings suggest, we can give our statement more specificity
-about the text if we use :func:`.column` and :func:`.table`::
+       for i in range(1000):
+           e.execute(select([cast(
+               ('foo_%d' % random.randint(0, 1000000)).encode('ascii'), Unicode)]))
  
-       from sqlalchemy import select, text, column, table
+The format of the warning here is::
  
-       stmt = select([column("a"), column("b")]).\\
-               where(text("a = b")).select_from(table("sometable"))
+       /path/lib/sqlalchemy/sql/sqltypes.py:186: SAWarning: Unicode type received
+         non-unicode bind param value 'foo_4852'. (this warning may be
+         suppressed after 10 occurrences)
  
-Where note also that :func:`.table` and :func:`.column` can now
-be imported from "sqlalchemy" without the "sql" part.
  
-The behavior here applies to :func:`.select` as well as to key methods
-on :class:`.Query`, including :meth:`.Query.filter`,
-:meth:`.Query.from_statement` and :meth:`.Query.having`.
+:ticket:`3178`
  
-ORDER BY and GROUP BY are special cases
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+.. _feature_2963:
  
-There is one case where usage of a string has special meaning, and as part
-of this change we have enhanced its functionality.  When we have a
-:func:`.select` or :class:`.Query` that refers to some column name or named
-label, we might want to GROUP BY and/or ORDER BY known columns or labels::
+.info dictionary improvements
+-----------------------------
  
-       stmt = select([
-               user.c.name,
-               func.count(user.c.id).label("id_count")
-       ]).group_by("name").order_by("id_count")
+The :attr:`.InspectionAttr.info` collection is now available on every kind
+of object that one would retrieve from the :attr:`.Mapper.all_orm_descriptors`
+collection.  This includes :class:`.hybrid_property` and :func:`.association_proxy`.
+However, as these objects are class-bound descriptors, they must be accessed
+**separately** from the class to which they are attached in order to get
+at the attribute.  Below this is illustrated using the
+:attr:`.Mapper.all_orm_descriptors` namespace::
  
-In the above statement we expect to see "ORDER BY id_count", as opposed to a
-re-statement of the function.   The string argument given is actively
-matched to an entry in the columns clause during compilation, so the above
-statement would produce as we expect, without warnings::
+       class SomeObject(Base):
+           # ...
+
+           @hybrid_property
+           def some_prop(self):
+               return self.value + 5
+
+
+       inspect(SomeObject).all_orm_descriptors.some_prop.info['foo'] = 'bar'
+
+It is also available as a constructor argument for all :class:`.SchemaItem`
+objects (e.g. :class:`.ForeignKey`, :class:`.UniqueConstraint` etc.) as well
+as remaining ORM constructs such as :func:`.orm.synonym`.
+
+:ticket:`2971`
+
+:ticket:`2963`
+
+.. _migration_3177:
+
+Change to single-table-inheritance criteria when using from_self(), count()
+---------------------------------------------------------------------------
+
+Given a single-table inheritance mapping, such as::
+
+       class Widget(Base):
+               __tablename__ = 'widget_table'
+
+       class FooWidget(Widget):
+               pass
+
+Using :meth:`.Query.from_self` or :meth:`.Query.count` against a subclass
+would produce a subquery, but then add the "WHERE" criteria for subtypes
+to the outside::
+
+       sess.query(FooWidget).from_self().all()
+
+rendering::
+
+       SELECT
+               anon_1.widgets_id AS anon_1_widgets_id,
+               anon_1.widgets_type AS anon_1_widgets_type
+       FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type
+       FROM widgets) AS anon_1
+       WHERE anon_1.widgets_type IN (?)
+
+The issue with this is that if the inner query does not specify all
+columns, then we can't add the WHERE clause on the outside (it actually tries,
+and produces a bad query).  This decision
+apparently goes way back to 0.6.5 with the note "may need to make more
+adjustments to this".   Well, those adjustments have arrived!  So now the
+above query will render::
+
+       SELECT
+               anon_1.widgets_id AS anon_1_widgets_id,
+               anon_1.widgets_type AS anon_1_widgets_type
+       FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type
+       FROM widgets
+       WHERE widgets.type IN (?)) AS anon_1
+
+So that queries that don't include "type" will still work!::
+
+       sess.query(FooWidget.id).count()
+
+Renders::
+
+       SELECT count(*) AS count_1
+       FROM (SELECT widgets.id AS widgets_id
+       FROM widgets
+       WHERE widgets.type IN (?)) AS anon_1
+
+
+:ticket:`3177`
+
+.. _behavioral_changes_orm_10:
+
+Behavioral Changes - ORM
+========================
+
+.. _migration_3061:
+
+Changes to attribute events and other operations regarding attributes that have no pre-existing value
+------------------------------------------------------------------------------------------------------
+
+In this change, the default return value of ``None`` when accessing an object
+is now returned dynamically on each access, rather than implicitly setting the
+attribute's state with a special "set" operation when it is first accessed.
+The visible result of this change is that ``obj.__dict__`` is not implicitly
+modified on get, and there are also some minor behavioral changes
+for :func:`.attributes.get_history` and related functions.
+
+Given an object with no state::
+
+       >>> obj = Foo()
  
-       SELECT users.name, count(users.id) AS id_count
-       FROM users GROUP BY users.name ORDER BY id_count
+It has always been SQLAlchemy's behavior such that if we access a scalar
+or many-to-one attribute that was never set, it is returned as ``None``::
  
-However, if we refer to a name that cannot be located, then we get
-the warning again, as below::
+       >>> obj.someattr
+       None
  
-       stmt = select([
-            user.c.name,
-            func.count(user.c.id).label("id_count")
-        ]).order_by("some_label")
+This value of ``None`` is in fact now part of the state of ``obj``, and is
+not unlike as though we had set the attribute explicitly, e.g.
+``obj.someattr = None``.  However, the "set on get" here would behave
+differently as far as history and events.   It would not emit any attribute
+event, and additionally if we view history, we see this::
  
-The output does what we say, but again it warns us::
+       >>> inspect(obj).attrs.someattr.history
+       History(added=(), unchanged=[None], deleted=())   # 0.9 and below
  
-       SAWarning: Can't resolve label reference 'some_label'; converting to
-       text() (this warning may be suppressed after 10 occurrences)
+That is, it's as though the attribute were always ``None`` and were
+never changed.  This is explicitly different from if we had set the
+attribute first instead::
  
-       SELECT users.name, count(users.id) AS id_count
-       FROM users ORDER BY some_label
+       >>> obj = Foo()
+       >>> obj.someattr = None
+       >>> inspect(obj).attrs.someattr.history
+       History(added=[None], unchanged=(), deleted=())  # all versions
  
-The above behavior applies to all those places where we might want to refer
-to a so-called "label reference"; ORDER BY and GROUP BY, but also within an
-OVER clause as well as a DISTINCT ON clause that refers to columns (e.g. the
-Postgresql syntax).
+The above means that the behavior of our "set" operation can be corrupted
+by the fact that the value was accessed via "get" earlier.  In 1.0, this
+inconsistency has been resolved, by no longer actually setting anything
+when the default "getter" is used.
  
-We can still specify any arbitrary expression for ORDER BY or others using
-:func:`.text`::
+       >>> obj = Foo()
+       >>> obj.someattr
+       None
+       >>> inspect(obj).attrs.someattr.history
+       History(added=(), unchanged=(), deleted=())  # 1.0
+       >>> obj.someattr = None
+       >>> inspect(obj).attrs.someattr.history
+       History(added=[None], unchanged=(), deleted=())
  
-       stmt = select([users]).order_by(text("some special expression"))
+The reason the above behavior hasn't had much impact is because the
+INSERT statement in relational databases considers a missing value to be
+the same as NULL in most cases.   Whether SQLAlchemy received a history
+event for a particular attribute set to None or not would usually not matter;
+as the difference between sending None/NULL or not wouldn't have an impact.
+However, as :ticket:`3060` illustrates, there are some seldom edge cases
+where we do in fact want to positively have ``None`` set.  Also, allowing
+the attribute event here means it's now possible to create "default value"
+functions for ORM mapped attributes.
  
-The upshot of the whole change is that SQLAlchemy now would like us
-to tell it when a string is sent that this string is explicitly
-a :func:`.text` construct, or a column, table, etc., and if we use it as a
-label name in an order by, group by, or other expression, SQLAlchemy expects
-that the string resolves to something known, else it should again
-be qualified with :func:`.text` or similar.
+As part of this change, the generation of the implicit "None" is now disabled
+for other situations where this used to occur; this includes when an
+attribute set operation on a many-to-one is received; previously, the "old" value
+would be "None" if it had been not set otherwise; it now will send the
+value :data:`.orm.attributes.NEVER_SET`, which is a value that may be sent
+to an attribute listener now.   This symbol may also be received when
+calling on mapper utility functions such as :meth:`.Mapper.primary_key_from_instance`;
+if the primary key attributes have no setting at all, whereas the value
+would be ``None`` before, it will now be the :data:`.orm.attributes.NEVER_SET`
+symbol, and no change to the object's state occurs.
  
-:ticket:`2992`
+:ticket:`3061`
  
  .. _migration_yield_per_eager_loading:
  
@@ -406,344 +542,210 @@ from the unit of work.
  Behavioral Changes - Core
  =========================
  
-.. _change_3163:
-
-Event listeners can not be added or removed from within that event's runner
----------------------------------------------------------------------------
-
-Removal of an event listener from inside that same event itself would
-modify  the elements of a list during iteration, which would cause
-still-attached event listeners to silently fail to fire.    To prevent
-this while still maintaining performance, the lists have been replaced
-with ``collections.deque()``, which does not allow any additions or
-removals during iteration, and instead raises ``RuntimeError``.
-
-:ticket:`3163`
-
-.. _change_3169:
-
-The INSERT...FROM SELECT construct now implies ``inline=True``
---------------------------------------------------------------
-
-Using :meth:`.Insert.from_select` now implies ``inline=True``
-on :func:`.insert`.  This helps to fix a bug where an
-INSERT...FROM SELECT construct would inadvertently be compiled
-as "implicit returning" on supporting backends, which would
-cause breakage in the case of an INSERT that inserts zero rows
-(as implicit returning expects a row), as well as arbitrary
-return data in the case of an INSERT that inserts multiple
-rows (e.g. only the first row of many).
-A similar change is also applied to an INSERT..VALUES
-with multiple parameter sets; implicit RETURNING will no longer emit
-for this statement either.  As both of these constructs deal
-with varible numbers of rows, the
-:attr:`.ResultProxy.inserted_primary_key` accessor does not
-apply.   Previously, there was a documentation note that one
-may prefer ``inline=True`` with INSERT..FROM SELECT as some databases
-don't support returning and therefore can't do "implicit" returning,
-but there's no reason an INSERT...FROM SELECT needs implicit returning
-in any case.   Regular explicit :meth:`.Insert.returning` should
-be used to return variable numbers of result rows if inserted
-data is needed.
-
-:ticket:`3169`
-
-.. _change_3027:
-
-``autoload_with`` now implies ``autoload=True``
------------------------------------------------
-
-A :class:`.Table` can be set up for reflection by passing
-:paramref:`.Table.autoload_with` alone::
-
-       my_table = Table('my_table', metadata, autoload_with=some_engine)
-
-:ticket:`3027`
-
-
-New Features
-============
-
-.. _feature_3034:
-
-Select/Query LIMIT / OFFSET may be specified as an arbitrary SQL expression
-----------------------------------------------------------------------------
-
-The :meth:`.Select.limit` and :meth:`.Select.offset` methods now accept
-any SQL expression, in addition to integer values, as arguments.  The ORM
-:class:`.Query` object also passes through any expression to the underlying
-:class:`.Select` object.   Typically
-this is used to allow a bound parameter to be passed, which can be substituted
-with a value later::
-
-       sel = select([table]).limit(bindparam('mylimit')).offset(bindparam('myoffset'))
-
-Dialects which don't support non-integer LIMIT or OFFSET expressions may continue
-to not support this behavior; third party dialects may also need modification
-in order to take advantage of the new behavior.  A dialect which currently
-uses the ``._limit`` or ``._offset`` attributes will continue to function
-for those cases where the limit/offset was specified as a simple integer value.
-However, when a SQL expression is specified, these two attributes will
-instead raise a :class:`.CompileError` on access.  A third-party dialect which
-wishes to support the new feature should now call upon the ``._limit_clause``
-and ``._offset_clause`` attributes to receive the full SQL expression, rather
-than the integer value.
-
-Behavioral Improvements
-=======================
-
-.. _feature_updatemany:
-
-UPDATE statements are now batched with executemany() in a flush
-----------------------------------------------------------------
-
-UPDATE statements can now be batched within an ORM flush
-into more performant executemany() call, similarly to how INSERT
-statements can be batched; this will be invoked within flush
-based on the following criteria:
-
-* two or more UPDATE statements in sequence involve the identical set of
-  columns to be modified.
-
-* The statement has no embedded SQL expressions in the SET clause.
-
-* The mapping does not use a :paramref:`~.orm.mapper.version_id_col`, or
-  the backend dialect supports a "sane" rowcount for an executemany()
-  operation; most DBAPIs support this correctly now.
-
-ORM full object fetches 25% faster
-----------------------------------
-
-The mechanics of the ``loading.py`` module as well as the identity map
-have undergone several passes of inlining, refactoring, and pruning, so
-that a raw load of rows now populates ORM-based objects around 25% faster.
-Assuming a 1M row table, a script like the following illustrates the type
-of load that's improved the most::
-
-       import time
-       from sqlalchemy import Integer, Column, create_engine, Table
-       from sqlalchemy.orm import Session
-       from sqlalchemy.ext.declarative import declarative_base
-
-       Base = declarative_base()
-
-       class Foo(Base):
-           __table__ = Table(
-               'foo', Base.metadata,
-               Column('id', Integer, primary_key=True),
-               Column('a', Integer(), nullable=False),
-               Column('b', Integer(), nullable=False),
-               Column('c', Integer(), nullable=False),
-           )
-
-       engine = create_engine(
-               'mysql+mysqldb://scott:tiger@localhost/test', echo=True)
-
-       sess = Session(engine)
-
-       now = time.time()
-
-       # avoid using all() so that we don't have the overhead of building
-       # a large list of full objects in memory
-       for obj in sess.query(Foo).yield_per(100).limit(1000000):
-           pass
+.. _migration_2992:
  
-       print("Total time: %d" % (time.time() - now))
+Warnings emitted when coercing full SQL fragments into text()
+-------------------------------------------------------------
  
-Local MacBookPro results bench from 19 seconds for 0.9 down to 14 seconds for
-1.0.  The :meth:`.Query.yield_per` call is always a good idea when batching
-huge numbers of rows, as it prevents the Python interpreter from having
-to allocate a huge amount of memory for all objects and their instrumentation
-at once.  Without the :meth:`.Query.yield_per`, the above script on the
-MacBookPro is 31 seconds on 0.9 and 26 seconds on 1.0, the extra time spent
-setting up very large memory buffers.
+Since SQLAlchemy's inception, there has always been an emphasis on not getting
+in the way of the usage of plain text.   The Core and ORM expression systems
+were intended to allow any number of points at which the user can just
+use plain text SQL expressions, not just in the sense that you can send a
+full SQL string to :meth:`.Connection.execute`, but that you can send strings
+with SQL expressions into many functions, such as :meth:`.Select.where`,
+:meth:`.Query.filter`, and :meth:`.Select.order_by`.
  
+Note that by "SQL expressions" we mean a **full fragment of a SQL string**,
+such as::
  
+       # the argument sent to where() is a full SQL expression
+       stmt = select([sometable]).where("somecolumn = 'value'")
  
-.. _feature_3176:
+and we are **not talking about string arguments**, that is, the normal
+behavior of passing string values that become parameterized::
  
-New KeyedTuple implementation dramatically faster
--------------------------------------------------
+       # This is a normal Core expression with a string argument -
+       # we aren't talking about this!!
+       stmt = select([sometable]).where(sometable.c.somecolumn == 'value')
  
-We took a look into the :class:`.KeyedTuple` implementation in the hopes
-of improving queries like this::
+The Core tutorial has long featured an example of the use of this technique,
+using a :func:`.select` construct where virtually all components of it
+are specified as straight strings.  However, despite this long-standing
+behavior and example, users are apparently surprised that this behavior
+exists, and when asking around the community, I was unable to find any user
+that was in fact *not* surprised that you can send a full string into a method
+like :meth:`.Query.filter`.
  
-       rows = sess.query(Foo.a, Foo.b, Foo.c).all()
+So the change here is to encourage the user to qualify textual strings when
+composing SQL that is partially or fully composed from textual fragments.
+When composing a select as below::
  
-The :class:`.KeyedTuple` class is used rather than Python's
-``collections.namedtuple()``, because the latter has a very complex
-type-creation routine that benchmarks much slower than :class:`.KeyedTuple`.
-However, when fetching hundreds of thousands of rows,
-``collections.namedtuple()`` quickly overtakes :class:`.KeyedTuple` which
-becomes dramatically slower as instance invocation goes up.   What to do?
-A new type that hedges between the approaches of both.   Benching
-all three types for "size" (number of rows returned) and "num"
-(number of distinct queries), the new "lightweight keyed tuple" either
-outperforms both, or lags very slightly behind the faster object, based on
-which scenario.  In the "sweet spot", where we are both creating a good number
-of new types as well as fetching a good number of rows, the lightweight
-object totally smokes both namedtuple and KeyedTuple::
+       stmt = select(["a", "b"]).where("a = b").select_from("sometable")
  
-       -----------------
-       size=10 num=10000                 # few rows, lots of queries
-       namedtuple: 3.60302400589         # namedtuple falls over
-       keyedtuple: 0.255059957504        # KeyedTuple very fast
-       lw keyed tuple: 0.582715034485    # lw keyed trails right on KeyedTuple
-       -----------------
-       size=100 num=1000                 # <--- sweet spot
-       namedtuple: 0.365247011185
-       keyedtuple: 0.24896979332
-       lw keyed tuple: 0.0889317989349   # lw keyed blows both away!
-       -----------------
-       size=10000 num=100
-       namedtuple: 0.572599887848
-       keyedtuple: 2.54251694679
-       lw keyed tuple: 0.613876104355
-       -----------------
-       size=1000000 num=10               # few queries, lots of rows
-       namedtuple: 5.79669594765         # namedtuple very fast
-       keyedtuple: 28.856498003          # KeyedTuple falls over
-       lw keyed tuple: 6.74346804619     # lw keyed trails right on namedtuple
+The statement is built up normally, with all the same coercions as before.
+However, one will see the following warnings emitted::
  
+       SAWarning: Textual column expression 'a' should be explicitly declared
+       with text('a'), or use column('a') for more specificity
+       (this warning may be suppressed after 10 occurrences)
  
-:ticket:`3176`
+       SAWarning: Textual column expression 'b' should be explicitly declared
+       with text('b'), or use column('b') for more specificity
+       (this warning may be suppressed after 10 occurrences)
  
-.. _feature_3178:
+       SAWarning: Textual SQL expression 'a = b' should be explicitly declared
+       as text('a = b') (this warning may be suppressed after 10 occurrences)
  
-New systems to safely emit parameterized warnings
--------------------------------------------------
+       SAWarning: Textual SQL FROM expression 'sometable' should be explicitly
+       declared as text('sometable'), or use table('sometable') for more
+       specificity (this warning may be suppressed after 10 occurrences)
  
-For a long time, there has been a restriction that warning messages could not
-refer to data elements, such that a particular function might emit an
-infinite number of unique warnings.  The key place this occurs is in the
-``Unicode type received non-unicode bind param value`` warning.  Placing
-the data value in this message would mean that the Python ``__warningregistry__``
-for that module, or in some cases the Python-global ``warnings.onceregistry``,
-would grow unbounded, as in most warning scenarios, one of these two collections
-is populated with every distinct warning message.
+These warnings attempt to show exactly where the issue is by displaying
+the parameters as well as where the string was received.
+The warnings make use of the :ref:`feature_3178` so that parameterized warnings
+can be emitted safely without running out of memory, and as always, if
+one wishes the warnings to be exceptions, the
+`Python Warnings Filter <https://docs.python.org/2/library/warnings.html>`_
+should be used::
  
-The change here is that by using a special ``string`` type that purposely
-changes how the string is hashed, we can control that a large number of
-parameterized messages are hashed only on a small set of possible hash
-values, such that a warning such as ``Unicode type received non-unicode
-bind param value`` can be tailored to be emitted only a specific number
-of times; beyond that, the Python warnings registry will begin recording
-them as duplicates.
+       import warnings
+       warnings.simplefilter("error")   # all warnings raise an exception
  
-To illustrate, the following test script will show only ten warnings being
-emitted for ten of the parameter sets, out of a total of 1000::
+Given the above warnings, our statement works just fine, but
+to get rid of the warnings we would rewrite our statement as follows::
  
-       from sqlalchemy import create_engine, Unicode, select, cast
-       import random
-       import warnings
+       from sqlalchemy import select, text
+       stmt = select([
+            text("a"),
+            text("b")
+        ]).where(text("a = b")).select_from(text("sometable"))
  
-       e = create_engine("sqlite://")
+and as the warnings suggest, we can give our statement more specificity
+about the text if we use :func:`.column` and :func:`.table`::
  
-       # Use the "once" filter (which is also the default for Python
-       # warnings).  Exactly ten of these warnings will
-       # be emitted; beyond that, the Python warnings registry will accumulate
-       # new values as dupes of one of the ten existing.
-       warnings.filterwarnings("once")
+       from sqlalchemy import select, text, column, table
  
-       for i in range(1000):
-           e.execute(select([cast(
-               ('foo_%d' % random.randint(0, 1000000)).encode('ascii'), Unicode)]))
+       stmt = select([column("a"), column("b")]).\
+               where(text("a = b")).select_from(table("sometable"))
  
-The format of the warning here is::
+Note also that :func:`.table` and :func:`.column` can now
+be imported from "sqlalchemy" without the "sql" part.
  
-       /path/lib/sqlalchemy/sql/sqltypes.py:186: SAWarning: Unicode type received
-         non-unicode bind param value 'foo_4852'. (this warning may be
-         suppressed after 10 occurrences)
+The behavior here applies to :func:`.select` as well as to key methods
+on :class:`.Query`, including :meth:`.Query.filter`,
+:meth:`.Query.from_statement` and :meth:`.Query.having`.
  
+ORDER BY and GROUP BY are special cases
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  
-:ticket:`3178`
+There is one case where usage of a string has special meaning, and as part
+of this change we have enhanced its functionality.  When we have a
+:func:`.select` or :class:`.Query` that refers to some column name or named
+label, we might want to GROUP BY and/or ORDER BY known columns or labels::
  
-.. _feature_2963:
+       stmt = select([
+               user.c.name,
+               func.count(user.c.id).label("id_count")
+       ]).group_by("name").order_by("id_count")
  
-.info dictionary improvements
------------------------------
+In the above statement we expect to see "ORDER BY id_count", as opposed to a
+re-statement of the function.   The string argument given is actively
+matched to an entry in the columns clause during compilation, so the above
+statement renders as we expect, without warnings (though note that
+the ``"name"`` expression has been resolved to ``users.name``!)::
  
-The :attr:`.InspectionAttr.info` collection is now available on every kind
-of object that one would retrieve from the :attr:`.Mapper.all_orm_descriptors`
-collection.  This includes :class:`.hybrid_property` and :func:`.association_proxy`.
-However, as these objects are class-bound descriptors, they must be accessed
-**separately** from the class to which they are attached in order to get
-at the attribute.  Below this is illustared using the
-:attr:`.Mapper.all_orm_descriptors` namespace::
+       SELECT users.name, count(users.id) AS id_count
+       FROM users GROUP BY users.name ORDER BY id_count
  
-       class SomeObject(Base):
-           # ...
+However, if we refer to a name that cannot be located, then we get
+the warning again, as below::
  
-           @hybrid_property
-           def some_prop(self):
-               return self.value + 5
+       stmt = select([
+            user.c.name,
+            func.count(user.c.id).label("id_count")
+        ]).order_by("some_label")
  
+The output does what we say, but again it warns us::
  
-       inspect(SomeObject).all_orm_descriptors.some_prop.info['foo'] = 'bar'
+       SAWarning: Can't resolve label reference 'some_label'; converting to
+       text() (this warning may be suppressed after 10 occurrences)
  
-It is also available as a constructor argument for all :class:`.SchemaItem`
-objects (e.g. :class:`.ForeignKey`, :class:`.UniqueConstraint` etc.) as well
-as remaining ORM constructs such as :func:`.orm.synonym`.
+       SELECT users.name, count(users.id) AS id_count
+       FROM users ORDER BY some_label
  
-:ticket:`2971`
+The above behavior applies to all those places where we might want to refer
+to a so-called "label reference"; ORDER BY and GROUP BY, but also within an
+OVER clause as well as a DISTINCT ON clause that refers to columns (e.g. the
+Postgresql syntax).
  
-:ticket:`2963`
+We can still specify any arbitrary expression for ORDER BY or others using
+:func:`.text`::
  
-.. _migration_3177:
+       stmt = select([users]).order_by(text("some special expression"))
  
-Change to single-table-inheritance criteria when using from_self(), count()
----------------------------------------------------------------------------
+The upshot of the whole change is that SQLAlchemy now would like us
+to tell it when a string is sent that this string is explicitly
+a :func:`.text` construct, or a column, table, etc., and if we use it as a
+label name in an order by, group by, or other expression, SQLAlchemy expects
+that the string resolves to something known, else it should again
+be qualified with :func:`.text` or similar.
  
-Given a single-table inheritance mapping, such as::
+:ticket:`2992`
  
-       class Widget(Base):
-               __table__ = 'widget_table'
+.. _change_3163:
  
-       class FooWidget(Widget):
-               pass
+Event listeners can not be added or removed from within that event's runner
+---------------------------------------------------------------------------
  
-Using :meth:`.Query.from_self` or :meth:`.Query.count` against a subclass
-would produce a subquery, but then add the "WHERE" criteria for subtypes
-to the outside::
+Removal of an event listener from inside that same event itself would
+modify the elements of a list during iteration, which would cause
+still-attached event listeners to silently fail to fire.    To prevent
+this while still maintaining performance, the lists have been replaced
+with ``collections.deque()``, which does not allow any additions or
+removals during iteration, and instead raises ``RuntimeError``.
  
-       sess.query(FooWidget).from_self().all()
+:ticket:`3163`
  
-rendering::
+.. _change_3169:
  
-       SELECT
-               anon_1.widgets_id AS anon_1_widgets_id,
-               anon_1.widgets_type AS anon_1_widgets_type
-       FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type,
-       FROM widgets) AS anon_1
-       WHERE anon_1.widgets_type IN (?)
+The INSERT...FROM SELECT construct now implies ``inline=True``
+--------------------------------------------------------------
  
-The issue with this is that if the inner query does not specify all
-columns, then we can't add the WHERE clause on the outside (it actually tries,
-and produces a bad query).  This decision
-apparently goes way back to 0.6.5 with the note "may need to make more
-adjustments to this".   Well, those adjustments have arrived!  So now the
-above query will render::
+Using :meth:`.Insert.from_select` now implies ``inline=True``
+on :func:`.insert`.  This helps to fix a bug where an
+INSERT...FROM SELECT construct would inadvertently be compiled
+as "implicit returning" on supporting backends, which would
+cause breakage in the case of an INSERT that inserts zero rows
+(as implicit returning expects a row), as well as arbitrary
+return data in the case of an INSERT that inserts multiple
+rows (e.g. only the first row of many).
+A similar change is also applied to an INSERT..VALUES
+with multiple parameter sets; implicit RETURNING will no longer emit
+for this statement either.  As both of these constructs deal
+with variable numbers of rows, the
+:attr:`.ResultProxy.inserted_primary_key` accessor does not
+apply.   Previously, there was a documentation note that one
+may prefer ``inline=True`` with INSERT..FROM SELECT as some databases
+don't support returning and therefore can't do "implicit" returning,
+but there's no reason an INSERT...FROM SELECT needs implicit returning
+in any case.   Regular explicit :meth:`.Insert.returning` should
+be used to return variable numbers of result rows if inserted
+data is needed.
  
-       SELECT
-               anon_1.widgets_id AS anon_1_widgets_id,
-               anon_1.widgets_type AS anon_1_widgets_type
-       FROM (SELECT widgets.id AS widgets_id, widgets.type AS widgets_type,
-       FROM widgets
-       WHERE widgets.type IN (?)) AS anon_1
+:ticket:`3169`
  
-So that queries that don't include "type" will still work!::
+.. _change_3027:
  
-       sess.query(FooWidget.id).count()
+``autoload_with`` now implies ``autoload=True``
+-----------------------------------------------
  
-Renders::
+A :class:`.Table` can be set up for reflection by passing
+:paramref:`.Table.autoload_with` alone::
  
-       SELECT count(*) AS count_1
-       FROM (SELECT widgets.id AS widgets_id
-       FROM widgets
-       WHERE widgets.type IN (?)) AS anon_1
+       my_table = Table('my_table', metadata, autoload_with=some_engine)
  
+:ticket:`3027`
  
-:ticket:`3177`
  
  
  Dialect Changes