From: Mike Bayer
Date: Sat, 25 Jun 2011 19:48:57 +0000 (-0400)
Subject: - new section on backrefs
X-Git-Tag: rel_0_7_2~50
X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=dc5d56d219262a766293df0aedd422550d851be8;p=thirdparty%2Fsqlalchemy%2Fsqlalchemy.git

- new section on backrefs
- new section on m2m self referential
- start illustrating things more in terms of declarative primarily
---

diff --git a/doc/build/orm/relationships.rst b/doc/build/orm/relationships.rst
index 682b83d5f1..3fbafd23c5 100644
--- a/doc/build/orm/relationships.rst
+++ b/doc/build/orm/relationships.rst
@@ -535,6 +535,268 @@ depth setting is configured via ``join_depth``:
     FROM nodes LEFT OUTER JOIN nodes AS nodes_2 ON nodes.id = nodes_2.parent_id LEFT OUTER JOIN nodes AS nodes_1 ON nodes_2.id = nodes_1.parent_id
     []
 
+Linking relationships with Backref
+----------------------------------
+
+The ``backref`` keyword argument was first introduced in :ref:`ormtutorial_toplevel`, and has been
+mentioned throughout many of the examples here.   What does it actually do?   Let's start
+with the canonical ``User`` and ``Address`` scenario::
+
+    from sqlalchemy import Integer, ForeignKey, String, Column
+    from sqlalchemy.ext.declarative import declarative_base
+    from sqlalchemy.orm import relationship
+
+    Base = declarative_base()
+
+    class User(Base):
+        __tablename__ = 'user'
+        id = Column(Integer, primary_key=True)
+        name = Column(String)
+
+        addresses = relationship("Address", backref="user")
+
+    class Address(Base):
+        __tablename__ = 'address'
+        id = Column(Integer, primary_key=True)
+        email = Column(String)
+        user_id = Column(Integer, ForeignKey('user.id'))
+
+The above configuration establishes a collection of ``Address`` objects on ``User`` called
+``User.addresses``.   It also establishes a ``.user`` attribute on ``Address`` which will
+refer to the parent ``User`` object.
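One way to picture how the collection and the scalar attribute stay in step is the
plain-Python sketch below. It is a simplified model written for illustration only and
is not how SQLAlchemy implements backrefs; the ``AddressList`` helper and the property
setter are hypothetical names that do not exist in the library.

```python
class AddressList(list):
    """Hypothetical list that keeps each member's .user pointing at its owner."""

    def __init__(self, owner):
        super().__init__()
        self.owner = owner

    def append(self, address):
        super().append(address)
        # Mirror the append onto the scalar side.
        address.user = self.owner

    def remove(self, address):
        super().remove(address)
        # Mirror the removal onto the scalar side.
        address.user = None


class User:
    def __init__(self):
        self.addresses = AddressList(self)


class Address:
    def __init__(self, email=None):
        self.email = email
        self._user = None

    @property
    def user(self):
        return self._user

    @user.setter
    def user(self, new_user):
        old_user = self._user
        if old_user is new_user:
            return
        self._user = new_user
        # Mirror the set onto the collection side, calling the base list
        # methods directly so the two handlers do not call each other forever.
        if old_user is not None and self in old_user.addresses:
            list.remove(old_user.addresses, self)
        if new_user is not None and self not in new_user.addresses:
            list.append(new_user.addresses, self)
```

With these toy classes, appending an ``Address`` to ``u1.addresses`` sets ``a1.user``,
and assigning ``a1.user = None`` removes it from the collection again, all in memory.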
+
+In fact, the ``backref`` keyword is only a common shortcut for placing a second
+``relationship`` onto the ``Address`` mapping, including the establishment
+of an event listener on both sides which will mirror attribute operations
+in both directions.   The above configuration is equivalent to::
+
+    from sqlalchemy import Integer, ForeignKey, String, Column
+    from sqlalchemy.ext.declarative import declarative_base
+    from sqlalchemy.orm import relationship
+
+    Base = declarative_base()
+
+    class User(Base):
+        __tablename__ = 'user'
+        id = Column(Integer, primary_key=True)
+        name = Column(String)
+
+        addresses = relationship("Address", back_populates="user")
+
+    class Address(Base):
+        __tablename__ = 'address'
+        id = Column(Integer, primary_key=True)
+        email = Column(String)
+        user_id = Column(Integer, ForeignKey('user.id'))
+
+        user = relationship("User", back_populates="addresses")
+
+Above, we add a ``.user`` relationship to ``Address`` explicitly.  On
+both relationships, the ``back_populates`` directive tells each relationship
+about the other one, indicating that they should establish "bi-directional"
+behavior between each other.   The primary effect of this configuration
+is that the relationship adds event handlers to both attributes
+which have the behavior of "when an append or set event occurs here, set ourselves
+onto the incoming attribute using this particular attribute name".
+The behavior is illustrated as follows.   Start with a ``User`` and an ``Address``
+instance.  The ``.addresses`` collection is empty, and the ``.user`` attribute
+is ``None``::
+
+    >>> u1 = User()
+    >>> a1 = Address()
+    >>> u1.addresses
+    []
+    >>> print a1.user
+    None
+
+However, once the ``Address`` is appended to the ``u1.addresses`` collection,
+both the collection and the scalar attribute have been populated::
+
+    >>> u1.addresses.append(a1)
+    >>> u1.addresses
+    [<__main__.Address object at 0x12a6ed0>]
+    >>> a1.user
+    <__main__.User object at 0x12a6590>
+
+This behavior of course works in reverse for removal operations as well, and
+for equivalent operations on both sides.   For example,
+when ``.user`` is set again to ``None``, the ``Address`` object is removed
+from the reverse collection::
+
+    >>> a1.user = None
+    >>> u1.addresses
+    []
+
+The manipulation of the ``.addresses`` collection and the ``.user`` attribute
+occurs entirely in Python without any interaction with the SQL database.
+Without this behavior, the proper state would become apparent on both sides only once the
+data has been flushed to the database, and later reloaded after a commit or
+expiration operation occurs.  The ``backref``/``back_populates`` behavior has the advantage
+that common bidirectional operations can reflect the correct state without requiring
+a database round trip.
+
+Remember, when the ``backref`` keyword is used on a single relationship, it's
+exactly the same as if the above two relationships were created individually
+using ``back_populates`` on each.
+
+Backref Arguments
+~~~~~~~~~~~~~~~~~~
+
+We've established that the ``backref`` keyword is merely a shortcut for building
+two individual :func:`.relationship` constructs that refer to each other.  Part of
+the behavior of this shortcut is that certain configurational arguments applied to
+the :func:`.relationship`
+will also be applied to the other direction - namely those arguments that describe
+the relationship at a schema level, and are unlikely to be different in the reverse
+direction.  The usual case
+here is a many-to-many :func:`.relationship` that has a ``secondary`` argument,
+or a one-to-many or many-to-one which has a ``primaryjoin`` argument (the
+``primaryjoin`` argument is discussed in :ref:`relationship_primaryjoin`).  For
+example, suppose we limit the list of ``Address`` objects to those whose email starts with "tony"::
+
+    from sqlalchemy import Integer, ForeignKey, String, Column
+    from sqlalchemy.ext.declarative import declarative_base
+    from sqlalchemy.orm import relationship
+
+    Base = declarative_base()
+
+    class User(Base):
+        __tablename__ = 'user'
+        id = Column(Integer, primary_key=True)
+        name = Column(String)
+
+        addresses = relationship("Address",
+                        primaryjoin="and_(User.id==Address.user_id, "
+                            "Address.email.startswith('tony'))",
+                        backref="user")
+
+    class Address(Base):
+        __tablename__ = 'address'
+        id = Column(Integer, primary_key=True)
+        email = Column(String)
+        user_id = Column(Integer, ForeignKey('user.id'))
+
+We can observe, by inspecting the resulting property, that both sides
+of the relationship have this join condition applied::
+
+    >>> print User.addresses.property.primaryjoin
+    "user".id = address.user_id AND address.email LIKE :email_1 || '%%'
+    >>>
+    >>> print Address.user.property.primaryjoin
+    "user".id = address.user_id AND address.email LIKE :email_1 || '%%'
+    >>>
+
+This reuse of arguments should pretty much do the "right thing" - it uses
+only arguments that are applicable, and in the case of a many-to-many
+relationship, will reverse the usage of ``primaryjoin`` and ``secondaryjoin``
+to correspond to the other direction (see the example in :ref:`self_referential_many_to_many`
+for this).
+
+It's very often the case however that we'd like to specify arguments that
+are specific to just the side where we happened to place the "backref".
+This includes :func:`.relationship` arguments like ``lazy``, ``remote_side``,
+``cascade`` and ``cascade_backrefs``.   For this case we use the :func:`.backref`
+function in place of a string::
+
+    #
+    from sqlalchemy.orm import backref
+
+    class User(Base):
+        __tablename__ = 'user'
+        id = Column(Integer, primary_key=True)
+        name = Column(String)
+
+        addresses = relationship("Address",
+                        backref=backref("user", lazy="joined"))
+
+Where above, we placed a ``lazy="joined"`` directive only on the ``Address.user``
+side, indicating that when a query against ``Address`` is made, a join to the ``User``
+entity should be made automatically which will populate the ``.user`` attribute of each
+returned ``Address``.   The :func:`.backref` function formatted the arguments we gave
+it into a form that is interpreted by the receiving :func:`.relationship` as additional
+arguments to be applied to the new relationship it creates.
+
+One Way Backrefs
+~~~~~~~~~~~~~~~~~
+
+An unusual case is that of the "one way backref".   This is where the "back-populating"
+behavior of the backref is only desirable in one direction. An example of this
+is a collection which contains a filtering ``primaryjoin`` condition.   We'd like to append
+items to this collection as needed, and have them populate the "parent" object on the
+incoming object. However, we'd also like to have items that are not part of the collection,
+but still have the same "parent" association - these items should never be in the
+collection.
+
+Taking our previous example, where we established a ``primaryjoin`` that limited the
+collection only to ``Address`` objects whose email address started with the word ``tony``,
+the usual backref behavior is that all items populate in both directions.   We wouldn't
+want this behavior for a case like the following::
+
+    >>> u1 = User()
+    >>> a1 = Address(email='mary')
+    >>> a1.user = u1
+    >>> u1.addresses
+    [<__main__.Address object at 0x1411910>]
+
+Above, the ``Address`` object that doesn't match the criterion of "starts with 'tony'"
+is present in the ``addresses`` collection of ``u1``.   After these objects are flushed,
+the transaction committed and their attributes expired for a re-load, the ``addresses``
+collection will hit the database on next access and no longer have this ``Address`` object
+present, due to the filtering condition.   But we can do away with this unwanted side
+of the "backref" behavior on the Python side by using two separate :func:`.relationship` constructs,
+placing ``back_populates`` only on one side::
+
+    from sqlalchemy import Integer, ForeignKey, String, Column
+    from sqlalchemy.ext.declarative import declarative_base
+    from sqlalchemy.orm import relationship
+
+    Base = declarative_base()
+
+    class User(Base):
+        __tablename__ = 'user'
+        id = Column(Integer, primary_key=True)
+        name = Column(String)
+        addresses = relationship("Address",
+                        primaryjoin="and_(User.id==Address.user_id, "
+                            "Address.email.startswith('tony'))",
+                        back_populates="user")
+
+    class Address(Base):
+        __tablename__ = 'address'
+        id = Column(Integer, primary_key=True)
+        email = Column(String)
+        user_id = Column(Integer, ForeignKey('user.id'))
+        user = relationship("User")
+
+With the above scenario, appending an ``Address`` object to the ``.addresses``
+collection of a ``User`` will always establish the ``.user`` attribute on that
+``Address``::
+
+    >>> u1 = User()
+    >>> a1 = Address(email='tony')
+    >>> u1.addresses.append(a1)
+    >>> a1.user
+    <__main__.User object at 0x1411850>
+
+However, applying a ``User`` to the ``.user`` attribute of an ``Address``,
+will not append the ``Address`` object to the collection::
+
+    >>> a2 = Address(email='mary')
+    >>> a2.user = u1
+    >>> a2 in u1.addresses
+    False
+
+Of course, we've disabled some of the usefulness of ``backref`` here, in that
+when we do append an ``Address`` that corresponds to the criteria of ``email.startswith('tony')``,
+it won't show up in the ``User.addresses`` collection until the session is flushed,
+and the attributes reloaded after a commit or expire operation.   While we could
+consider an attribute event that checks this criterion in Python, this starts
+to cross the line of duplicating too much SQL behavior in Python.  The backref behavior
+itself is only a slight transgression of this philosophy - SQLAlchemy tries to keep
+these to a minimum overall.
+
+.. _relationship_primaryjoin:
+
 Specifying Alternate Join Conditions to relationship()
 ------------------------------------------------------
 
@@ -558,16 +820,60 @@ relationship it also formulates the **secondary join condition**::
     secondary_table.c.child_id == child_table.c.id --> child_table
                 secondaryjoin
 
-If you are working with a :class:`~sqlalchemy.schema.Table` which has no
-:class:`~sqlalchemy.schema.ForeignKey` objects on it (which can be the case
+If you are working with a :class:`.Table` which has no
+:class:`.ForeignKey` metadata established (which can be the case
 when using reflected tables with MySQL), or if the join condition cannot be
-expressed by a simple foreign key relationship, use the ``primaryjoin`` and
-possibly ``secondaryjoin`` conditions to create the appropriate relationship.
+expressed by a simple foreign key relationship, use the ``primaryjoin``, and
+for many-to-many relationships ``secondaryjoin``, directives
+to create the appropriate relationship.
 
-In this example we create a relationship ``boston_addresses`` which will only
-load the user addresses with a city of "Boston":
+In this example, using the ``User`` class as well as an ``Address`` class
+which stores a street address, we create a relationship ``boston_addresses`` which will only
+load those ``Address`` objects which specify a city of "Boston"::
 
-.. sourcecode:: python+sql
+    from sqlalchemy import Integer, ForeignKey, String, Column
+    from sqlalchemy.ext.declarative import declarative_base
+    from sqlalchemy.orm import relationship
+
+    Base = declarative_base()
+
+    class User(Base):
+        __tablename__ = 'user'
+        id = Column(Integer, primary_key=True)
+        name = Column(String)
+        addresses = relationship("Address",
+                        primaryjoin="and_(User.id==Address.user_id, "
+                            "Address.city=='Boston')")
+
+    class Address(Base):
+        __tablename__ = 'address'
+        id = Column(Integer, primary_key=True)
+        user_id = Column(Integer, ForeignKey('user.id'))
+
+        street = Column(String)
+        city = Column(String)
+        state = Column(String)
+        zip = Column(String)
+
+Note above we specified the ``primaryjoin`` argument as a string - this feature
+is available only when the mapping is constructed using the Declarative extension,
+and allows us to specify a full SQL expression
+between two entities before those entities have been fully constructed.   When
+all mappings have been defined, an automatic "mapper configuration" step interprets
+these string arguments when first needed.
+
+Within this string SQL expression, we also made use of the :func:`.and_` conjunction construct to establish
+two distinct predicates for the join condition - joining both the ``User.id`` and
+``Address.user_id`` columns to each other, as well as limiting rows in ``Address``
+to just ``city='Boston'``.   When using Declarative, rudimentary SQL functions like
+:func:`.and_` are automatically available in the evaluated namespace of a string
+:func:`.relationship` argument.
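The deferred evaluation described above can be pictured with the plain-Python sketch
below. It is only an analogy under stated assumptions: ``Col``, ``DeferredExpression``
and ``class_registry`` are hypothetical names invented here, the ``and_`` function is a
string-joining stand-in rather than SQLAlchemy's :func:`.and_`, and Declarative's real
resolution machinery is more involved.

```python
def and_(*clauses):
    """Stand-in for a SQL AND conjunction: joins clause strings."""
    return " AND ".join(clauses)


class Col:
    """Hypothetical column whose == renders a SQL-like comparison string."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        right = other.name if isinstance(other, Col) else repr(other)
        return "%s = %s" % (self.name, right)


# Hypothetical registry of class names, filled in as classes are defined.
class_registry = {}


class DeferredExpression:
    """Holds expression source text and evaluates it only on request."""

    def __init__(self, source):
        self.source = source

    def resolve(self):
        # Helper functions such as and_ are placed in the namespace
        # alongside every class registered so far.
        namespace = {"and_": and_}
        namespace.update(class_registry)
        return eval(self.source, {"__builtins__": {}}, namespace)


# The expression is written before Address exists; nothing is evaluated yet.
expr = DeferredExpression(
    "and_(User.id == Address.user_id, Address.city == 'Boston')")


class User:
    id = Col("user.id")


class Address:
    user_id = Col("address.user_id")
    city = Col("address.city")


class_registry.update(User=User, Address=Address)
```

Calling ``expr.resolve()`` once both classes are registered yields the rendered
condition; calling it earlier would fail with a ``NameError``, which is why the
evaluation has to wait until all mappings are defined.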
+
+When using classical mappings, we have the advantage of the :class:`.Table` objects
+already being present when the mapping is defined, so that the SQL expression
+can be created immediately::
+
+    from sqlalchemy.orm import relationship, mapper
 
     class User(object):
         pass
@@ -577,27 +883,94 @@ load the user addresses with a city of "Boston":
     mapper(Address, addresses_table)
     mapper(User, users_table, properties={
         'boston_addresses': relationship(Address, primaryjoin=
-                    and_(users_table.c.user_id==addresses_table.c.user_id,
+                    and_(users_table.c.id==addresses_table.c.user_id,
                     addresses_table.c.city=='Boston'))
     })
 
+Note that the custom criteria we use in a ``primaryjoin`` are generally only significant
+when SQLAlchemy is rendering SQL in order to load or represent this relationship.
+That is, they're used
+in the SQL statement that's emitted in order to perform a per-attribute lazy load, or when a join is
+constructed at query time, such as via :meth:`.Query.join`, or via the eager "joined" or "subquery"
+styles of loading.   When in-memory objects are being manipulated, we can place any ``Address`` object
+we'd like into the ``boston_addresses`` collection, regardless of what the value of the ``.city``
+attribute is.   The objects will remain present in the collection until the attribute is expired
+and re-loaded from the database where the criterion is applied.   When
+a flush occurs, the objects inside of ``boston_addresses`` will be flushed unconditionally, assigning the
+value of the primary key ``user.id`` column onto the foreign-key-holding ``address.user_id`` column
+for each row.  The ``city`` criterion has no effect here, as the flush process only cares about synchronizing primary
+key values into referencing foreign key values.
+
+.. _self_referential_many_to_many:
+
+Self-Referential Many-to-Many Relationship
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
 Many to many relationships can be customized by one or both of ``primaryjoin``
-and ``secondaryjoin``, shown below with just the default many-to-many
-relationship explicitly set:
+and ``secondaryjoin``.    A common situation for custom primary and secondary joins
+is when establishing a many-to-many relationship from a class to itself, as shown below::
 
-.. sourcecode:: python+sql
+    from sqlalchemy import Integer, ForeignKey, String, Column, Table
+    from sqlalchemy.ext.declarative import declarative_base
+    from sqlalchemy.orm import relationship
 
-    class User(object):
-        pass
-    class Keyword(object):
+    Base = declarative_base()
+
+    node_to_node = Table("node_to_node", Base.metadata,
+        Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
+        Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
+    )
+
+    class Node(Base):
+        __tablename__ = 'node'
+        id = Column(Integer, primary_key=True)
+        label = Column(String)
+        right_nodes = relationship("Node",
+                            secondary=node_to_node,
+                            primaryjoin=id==node_to_node.c.left_node_id,
+                            secondaryjoin=id==node_to_node.c.right_node_id,
+                            backref="left_nodes"
+        )
+
+Where above, SQLAlchemy can't know automatically which columns should connect
+to which for the ``right_nodes`` and ``left_nodes`` relationships.   The ``primaryjoin``
+and ``secondaryjoin`` arguments establish how we'd like to join to the association table.
+In the Declarative form above, as we are declaring these conditions within the Python
+block that corresponds to the ``Node`` class, the ``id`` variable is available directly
+as the ``Column`` object we wish to join with.
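To see why the two join conditions must be spelled out, and why the reverse direction
simply swaps them, consider the association rows as plain tuples. This is a
plain-Python analogy with made-up sample data, not SQLAlchemy code; the function names
mirror the relationship names only for readability.

```python
# Each row is (left_node_id, right_node_id), like the node_to_node table.
# The three rows are sample data invented for this sketch.
node_to_node = [(1, 2), (1, 3), (2, 3)]


def right_nodes(node_id):
    # "primaryjoin": our id matches left_node_id.
    # "secondaryjoin": the related node's id matches right_node_id.
    return sorted(right for left, right in node_to_node if left == node_id)


def left_nodes(node_id):
    # The reverse direction swaps the roles of the two columns.
    return sorted(left for left, right in node_to_node if right == node_id)
```

Both columns refer to the same ``node.id``, so nothing in the table itself says which
one plays the "parent" role; the choice made for ``right_nodes`` determines the
mirrored choice for ``left_nodes``.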
+
+A classical mapping situation here is similar, where ``node_to_node`` can be joined
+to ``node.c.id``::
+
+    from sqlalchemy import Integer, ForeignKey, String, Column, Table, MetaData
+    from sqlalchemy.orm import relationship, mapper
+
+    metadata = MetaData()
+
+    node_to_node = Table("node_to_node", metadata,
+        Column("left_node_id", Integer, ForeignKey("node.id"), primary_key=True),
+        Column("right_node_id", Integer, ForeignKey("node.id"), primary_key=True)
+    )
+
+    node = Table("node", metadata,
+        Column('id', Integer, primary_key=True),
+        Column('label', String)
+    )
+    class Node(object):
         pass
-    mapper(Keyword, keywords_table)
-    mapper(User, users_table, properties={
-        'keywords': relationship(Keyword, secondary=userkeywords_table,
-            primaryjoin=users_table.c.user_id==userkeywords_table.c.user_id,
-            secondaryjoin=userkeywords_table.c.keyword_id==keywords_table.c.keyword_id
-            )
-    })
+
+    mapper(Node, node, properties={
+        'right_nodes':relationship(Node,
+                            secondary=node_to_node,
+                            primaryjoin=node.c.id==node_to_node.c.left_node_id,
+                            secondaryjoin=node.c.id==node_to_node.c.right_node_id,
+                            backref="left_nodes"
+                        )})
+
+
+Note that in both examples, the ``backref`` keyword specifies a ``left_nodes``
+backref - when :func:`.relationship` creates the second relationship in the reverse
+direction, it's smart enough to reverse the ``primaryjoin`` and ``secondaryjoin`` arguments.
 
 Specifying Foreign Keys
 ~~~~~~~~~~~~~~~~~~~~~~~~