See the recipe at `FilteredQuery <http://www.sqlalchemy.org/trac/wiki/UsageRecipes/FilteredQuery>`_.
+.. _faq_query_deduplicating:
+
+My Query does not return the same number of objects as query.count() tells me - why?
+-------------------------------------------------------------------------------------
+
+The :class:`.Query` object, when asked to return a list of ORM-mapped objects,
+will **deduplicate the objects based on primary key**. That is, if we
+for example use the ``User`` mapping described at :ref:`ormtutorial_toplevel`,
+and we had a SQL query like the following::
+
+ q = session.query(User).outerjoin(User.addresses).filter(User.name == 'jack')
+
+Above, the sample data used in the tutorial has two rows in the ``addresses``
+table for the ``users`` row with the name ``'jack'``, primary key value 5.
+If we ask the above query for a :meth:`.Query.count`, we will get the answer
+**2**::
+
+ >>> q.count()
+ 2
+
+However, if we run :meth:`.Query.all` or iterate over the query, we get back
+**one element**::
+
+ >>> q.all()
+ [User(id=5, name='jack', ...)]
+
+This is because when the :class:`.Query` object returns full entities, they
+are **deduplicated**. This does not occur if we instead request individual
+columns back::
+
+ >>> session.query(User.id, User.name).outerjoin(User.addresses).filter(User.name == 'jack').all()
+ [(5, 'jack'), (5, 'jack')]
+
+There are two main reasons the :class:`.Query` will deduplicate:
+
+* **To allow joined eager loading to work correctly** - :ref:`joined_eager_loading`
+ works by querying rows using joins against related tables, where it then routes
+ rows from those joins into collections upon the lead objects. In order to do this,
+ it has to fetch rows where the lead object primary key is repeated for each
+ sub-entry. This pattern can then continue into further sub-collections such
+ that a multiple of rows may be processed for a single lead object, such as
+ ``User(id=5)``. The dedpulication allows us to receive objects in the way they
+ were queried, e.g. all the ``User()`` objects whose name is ``'jack'`` which
+ for us is one object, with
+ the ``User.addresses`` collection eagerly loaded as was indicated either
+ by ``lazy='joined'`` on the :func:`.relationship` or via the :func:`.joinedload`
+ option. For consistency, the deduplication is still applied whether or not
+ the joinedload is established, as the key philosophy behind eager loading
+ is that these options never affect the result.
+
+* **To eliminate confusion regarding the identity map** - this is admittedly
+ the less critical reason. As the :class:`.Session`
+ makes use of an :term:`identity map`, even though our SQL result set has two
+ rows with primary key 5, there is only one ``User(id=5)`` object inside the :class:`.Session`
+ which must be maintained uniquely on its identity, that is, its primary key /
+ class combination. It doesn't actually make much sense, if one is querying for
+ ``User()`` objects, to get the same object multiple times in the list. An
+ ordered set would potentially be a better representation of what :class:`.Query`
+ seeks to return when it returns full objects.
+
+The issue of :class:`.Query` deduplication remains problematic, mostly for the
+single reason that the :meth:`.Query.count` method is inconsistent, and the
+current status is that joined eager loading has in recent releases been
+superseded first by the "subquery eager loading" strategy and more recently the
+"select IN eager loading" strategy, both of which are generally more
+appropriate for collection eager loading. As this evolution continues,
+SQLAlchemy may alter this behavior on :class:`.Query`, which may also involve
+new APIs in order to more directly control this behavior, and may also alter
+the behavior of joined eager loading in order to create a more consistent usage
+pattern.
+
+
I've created a mapping against an Outer Join, and while the query returns rows, no objects are returned. Why not?
------------------------------------------------------------------------------------------------------------------
or ``CONTAINS`` function; its behavior will vary by backend and is not
available on some backends such as SQLite.
+.. _orm_tutorial_query_returning:
+
Returning Lists and Scalars
---------------------------
{stop}[<User(name='ed', fullname='Ed Jones', nickname='eddie')>,
<User(name='fred', fullname='Fred Flintstone', nickname='freddy')>]
+ .. warning::
+
+ When the :class:`.Query` object returns lists of ORM-mapped objects
+ such as the ``User`` object above, the entries are **deduplicated**
+ based on primary key, as the results are interpreted from the SQL
+ result set. That is, if SQL query returns a row with ``id=7`` twice,
+ you would only get a single ``User(id=7)`` object back in the result
+ list. This does not apply to the case when individual columns are
+ queried.
+
+ .. seealso::
+
+ :ref:`faq_query_deduplicating`
+
+
* :meth:`~.Query.first()` applies a limit of one and returns
the first result as a scalar:
self._suffixes = suffixes
def all(self):
- """Return the results represented by this ``Query`` as a list.
+ """Return the results represented by this :class:`.Query` as a list.
- This results in an execution of the underlying query.
+ This results in an execution of the underlying SQL statement.
+
+ .. warning:: The :class:`.Query` object, when asked to return either
+ a sequence or iterator that consists of full ORM-mapped entities,
+ will **deduplicate entries based on primary key**. See the FAQ for
+ more details.
+
+ .. seealso::
+ :ref:`faq_query_deduplicating`
"""
return list(self)
)
def count(self):
- r"""Return a count of rows this Query would return.
+ r"""Return a count of rows this the SQL formed by this :class:`Query`
+ would return.
This generates the SQL for this Query as follows::
SELECT <rest of query follows...>
) AS anon_1
- For fine grained control over specific columns
- to count, to skip the usage of a subquery or
- otherwise control of the FROM clause,
- or to use other aggregate functions,
- use :attr:`~sqlalchemy.sql.expression.func`
- expressions in conjunction
- with :meth:`~.Session.query`, i.e.::
+ The above SQL returns a single row, which is the aggregate value
+ of the count function; the :meth:`.Query.count` method then returns
+ that single integer value.
+
+ .. warning::
+
+ It is important to note that the value returned by
+ count() is **not the same as the number of ORM objects that this
+ Query would return from a method such as the .all() method**.
+ The :class:`.Query` object, when asked to return full entities,
+ will **deduplicate entries based on primary key**, meaning if the
+ same primary key value would appear in the results more than once,
+ only one object of that primary key would be present. This does
+ not apply to a query that is against individual columns.
+
+ .. seealso::
+
+ :ref:`faq_query_deduplicating`
+
+ :ref:`orm_tutorial_query_returning`
+
+ For fine grained control over specific columns to count, to skip the
+ usage of a subquery or otherwise control of the FROM clause, or to use
+ other aggregate functions, use :attr:`~sqlalchemy.sql.expression.func`
+ expressions in conjunction with :meth:`~.Session.query`, i.e.::
from sqlalchemy import func