From: Mike Bayer Date: Wed, 22 Sep 2010 18:22:16 +0000 (-0400) Subject: - in depth docs about some merge() tips X-Git-Tag: rel_0_6_5~50 X-Git-Url: http://git.ipfire.org/cgi-bin/gitweb.cgi?a=commitdiff_plain;h=eae4de02a9b9bcf070a12607ada4098fb63e26f2;p=thirdparty%2Fsqlalchemy%2Fsqlalchemy.git - in depth docs about some merge() tips - docs about backref cascade - Another new flag on relationship(), cascade_backrefs, disables the "save-update" cascade when the event was initiated on the "reverse" side of a bidirectional relationship. This is a cleaner behavior so that many-to-ones can be set on a transient object without it getting sucked into the child object's session, while still allowing the forward collection to cascade. We *might* default this to False in 0.7. --- diff --git a/CHANGES b/CHANGES index a59b295707..d714977957 100644 --- a/CHANGES +++ b/CHANGES @@ -73,7 +73,16 @@ CHANGES object is loaded, so backrefs aren't available until after a flush. The flag is only intended for very specific use cases. - + + - Another new flag on relationship(), cascade_backrefs, + disables the "save-update" cascade when the event was + initiated on the "reverse" side of a bidirectional + relationship. This is a cleaner behavior so that + many-to-ones can be set on a transient object without + it getting sucked into the child object's session, + while still allowing the forward collection to + cascade. We *might* default this to False in 0.7. + - Slight improvement to the behavior of "passive_updates=False" when placed only on the many-to-one side of a relationship; documentation has diff --git a/doc/build/orm/session.rst b/doc/build/orm/session.rst index 7448392fe6..16ca7aff00 100644 --- a/doc/build/orm/session.rst +++ b/doc/build/orm/session.rst @@ -346,6 +346,8 @@ The :func:`~sqlalchemy.orm.session.Session.add` operation **cascades** along the ``save-update`` cascade. For more details see the section :ref:`unitofwork_cascades`. +.. 
_unitofwork_merging: + Merging ------- @@ -358,17 +360,17 @@ follows:: When given an instance, it follows these steps: - * It examines the primary key of the instance. If it's present, it attempts - to load an instance with that primary key (or pulls from the local - identity map). - * If there's no primary key on the given instance, or the given primary key - does not exist in the database, a new instance is created. - * The state of the given instance is then copied onto the located/newly - created instance. - * The operation is cascaded to associated child items along the ``merge`` - cascade. Note that all changes present on the given instance, including - changes to collections, are merged. - * The new instance is returned. +* It examines the primary key of the instance. If it's present, it attempts + to load an instance with that primary key (or pulls from the local + identity map). +* If there's no primary key on the given instance, or the given primary key + does not exist in the database, a new instance is created. +* The state of the given instance is then copied onto the located/newly + created instance. +* The operation is cascaded to associated child items along the ``merge`` + cascade. Note that all changes present on the given instance, including + changes to collections, are merged. +* The new instance is returned. With :func:`~sqlalchemy.orm.session.Session.merge`, the given instance is not placed within the session, and can be associated with a different session or @@ -377,19 +379,22 @@ taking the state of any kind of object structure without regard for its origins or current session associations and placing that state within a session. 
Here's two examples: - * An application which reads an object structure from a file and wishes to - save it to the database might parse the file, build up the structure, and - then use :func:`~sqlalchemy.orm.session.Session.merge` to save it to the - database, ensuring that the data within the file is used to formulate the - primary key of each element of the structure. Later, when the file has - changed, the same process can be re-run, producing a slightly different - object structure, which can then be ``merged`` in again, and the - :class:`~sqlalchemy.orm.session.Session` will automatically update the - database to reflect those changes. - * A web application stores mapped entities within an HTTP session object. - When each request starts up, the serialized data can be merged into the - session, so that the original entity may be safely shared among requests - and threads. +* An application which reads an object structure from a file and wishes to + save it to the database might parse the file, build up the + structure, and then use + :func:`~sqlalchemy.orm.session.Session.merge` to save it + to the database, ensuring that the data within the file is + used to formulate the primary key of each element of the + structure. Later, when the file has changed, the same + process can be re-run, producing a slightly different + object structure, which can then be ``merged`` in again, + and the :class:`~sqlalchemy.orm.session.Session` will + automatically update the database to reflect those + changes. +* A web application stores mapped entities within an HTTP session object. + When each request starts up, the serialized data can be + merged into the session, so that the original entity may + be safely shared among requests and threads. :func:`~sqlalchemy.orm.session.Session.merge` is frequently used by applications which implement their own second level caches. 
This refers to an
@@ -406,6 +411,133 @@ all of its children may not contain any pending changes, and it's also of
 course possible that newer information in the database will not be present on
 the merged object, since no load is issued.
 
+Merge Tips
+~~~~~~~~~~
+
+:meth:`~.Session.merge` is an extremely useful method for many purposes. However,
+it deals with the intricate border between objects that are transient/detached and
+those that are persistent, as well as the automated transference of state.
+The wide variety of scenarios that can present themselves here often requires a
+more careful approach to the state of objects. Common problems with merge usually involve
+some unexpected state regarding the object being passed to :meth:`~.Session.merge`.
+
+Let's use the canonical example of the ``User`` and ``Address`` objects::
+
+    class User(Base):
+        __tablename__ = 'user'
+
+        id = Column(Integer, primary_key=True)
+        name = Column(String(50), nullable=False)
+        addresses = relationship("Address", backref="user")
+
+    class Address(Base):
+        __tablename__ = 'address'
+
+        id = Column(Integer, primary_key=True)
+        email_address = Column(String(50), nullable=False)
+        user_id = Column(Integer, ForeignKey('user.id'), nullable=False)
+
+Assume a ``User`` object with one ``Address``, already persistent::
+
+    >>> u1 = User(name='ed', addresses=[Address(email_address='ed@ed.com')])
+    >>> session.add(u1)
+    >>> session.commit()
+
+We now create ``a1``, an object outside the session, which we'd like
+to merge on top of the existing ``Address``::
+
+    >>> existing_a1 = u1.addresses[0]
+    >>> a1 = Address(id=existing_a1.id)
+
+A surprise would occur if we said this::
+
+    >>> a1.user = u1
+    >>> a1 = session.merge(a1)
+    >>> session.commit()
+    sqlalchemy.orm.exc.FlushError: New instance
+    with identity key (, (1,)) conflicts with
+    persistent instance
+
+Why is that? We weren't careful with our cascades. The assignment
+of ``a1.user`` to a persistent object cascaded to the backref of ``User.addresses``
+and made our ``a1`` object pending, as though we had added it. Now we have
+*two* ``Address`` objects in the session::
+
+    >>> a1 = Address()
+    >>> a1.user = u1
+    >>> a1 in session
+    True
+    >>> existing_a1 in session
+    True
+    >>> a1 is existing_a1
+    False
+
+Above, our ``a1`` is already pending in the session. The
+subsequent :meth:`~.Session.merge` operation essentially
+does nothing. Cascade can be configured via the ``cascade``
+option on :func:`.relationship`, although in this case it
+would mean removing the ``save-update`` cascade from the
+``User.addresses`` relationship - and usually, that behavior
+is extremely convenient. The solution here would usually be to not assign
+``a1.user`` to an object already persistent in the target
+session.
+
+Note that a new :func:`.relationship` option introduced in 0.6.5,
+``cascade_backrefs=False``, will also prevent the ``Address`` from
+being added to the session via the ``a1.user = u1`` assignment.
+
+Further detail on cascade operation is at :ref:`unitofwork_cascades`.
+
+Another example of unexpected state::
+
+    >>> a1 = Address(id=existing_a1.id, user_id=u1.id)
+    >>> a1.user is None
+    True
+    >>> a1 = session.merge(a1)
+    >>> session.commit()
+    sqlalchemy.exc.IntegrityError: (IntegrityError) address.user_id
+    may not be NULL
+
+Here, we accessed ``a1.user``, which returned its default value
+of ``None``, which, as a result of this access, has been placed in the ``__dict__`` of
+our object ``a1``. Normally, this operation creates no change event,
+so the ``user_id`` attribute takes precedence during a
+flush.
But when we merge the ``Address`` object into the session, the operation
+is equivalent to::
+
+    >>> existing_a1.id = existing_a1.id
+    >>> existing_a1.user_id = u1.id
+    >>> existing_a1.user = None
+
+Where above, both ``user_id`` and ``user`` are assigned to, and change events
+are emitted for both. The ``user`` association
+takes precedence, and ``None`` is applied to ``user_id``, causing a failure.
+
+Most :meth:`~.Session.merge` issues can be examined by first checking -
+is the object prematurely in the session?
+
+.. sourcecode:: python+sql
+
+    >>> a1 = Address(id=existing_a1.id, user_id=u1.id)
+    >>> assert a1 not in session
+    >>> a1 = session.merge(a1)
+
+Or is there state on the object that we don't want? Examining ``__dict__``
+is a quick way to check::
+
+    >>> a1 = Address(id=existing_a1.id, user_id=u1.id)
+    >>> a1.user
+    >>> a1.__dict__
+    {'_sa_instance_state': ,
+        'user_id': 1,
+        'id': 1,
+        'user': None}
+    >>> # we don't want user=None merged, remove it
+    >>> del a1.user
+    >>> a1 = session.merge(a1)
+    >>> # success
+    >>> session.commit()
+
 Deleting
 --------
 
@@ -729,7 +861,7 @@ relationship between an ``Order`` and an ``Item`` object.
 The ``customer`` relationship specifies only the "save-update" cascade value,
 indicating most operations will not be cascaded from a parent ``Order``
 instance to a child ``User`` instance except for the
-:func:`~sqlalchemy.orm.session.Session.add` operation. "save-update" cascade
+:func:`~sqlalchemy.orm.session.Session.add` operation. ``save-update`` cascade
 indicates that an :func:`~sqlalchemy.orm.session.Session.add` on the parent
 will cascade to all child items, and also that items added to a parent which is
 already present in a session will also be added to that same session.
@@ -752,6 +884,45 @@ objects to allow attachment to only one parent at a time.
 The default value for ``cascade`` on :func:`~sqlalchemy.orm.relationship` is
 ``save-update, merge``.
 
+``save-update`` cascade also takes place on backrefs by default.
This means
+that, given a mapping such as this::
+
+    mapper(Order, order_table, properties={
+        'items' : relationship(Item, backref='order')
+    })
+
+If an ``Order`` is already in the session, and is assigned to the ``order``
+attribute of an ``Item``, the backref appends the ``Item`` to the ``items``
+collection of that ``Order``, resulting in the ``save-update`` cascade taking
+place::
+
+    >>> o1 = Order()
+    >>> session.add(o1)
+    >>> o1 in session
+    True
+
+    >>> i1 = Item()
+    >>> i1.order = o1
+    >>> i1 in o1.items
+    True
+    >>> i1 in session
+    True
+
+This behavior can be disabled as of 0.6.5 using the ``cascade_backrefs`` flag::
+
+    mapper(Order, order_table, properties={
+        'items' : relationship(Item, backref='order',
+                                    cascade_backrefs=False)
+    })
+
+So above, the assignment of ``i1.order = o1`` will append ``i1`` to the ``items``
+collection of ``o1``, but will not add ``i1`` to the session. You can of
+course :func:`~.Session.add` ``i1`` to the session at a later point. This option
+may be helpful for situations where an object needs to be kept out of a
+session until its construction is completed, but still needs to be given
+associations to objects which are already persistent in the target session.
+
+
 .. _unitofwork_transaction:
 
 Managing Transactions
diff --git a/lib/sqlalchemy/orm/__init__.py b/lib/sqlalchemy/orm/__init__.py
index 39c68f0aaf..8b32d1a273 100644
--- a/lib/sqlalchemy/orm/__init__.py
+++ b/lib/sqlalchemy/orm/__init__.py
@@ -255,7 +255,19 @@ def relationship(argument, secondary=None, **kwargs):
       * ``all`` - shorthand for "save-update,merge, refresh-expire, expunge, delete"
-
+
+    :param cascade_backrefs=True:
+      a boolean value indicating if the ``save-update`` cascade should
+      operate along a backref event.
When set to ``False`` on a + one-to-many relationship that has a many-to-one backref, assigning + a persistent object to the many-to-one attribute on a transient object + will not add the transient to the session. Similarly, when + set to ``False`` on a many-to-one relationship that has a one-to-many + backref, appending a persistent object to the one-to-many collection + on a transient object will not add the transient to the session. + + ``cascade_backrefs`` is new in 0.6.5. + :param collection_class: a class or callable that returns a new list-holding object. will be used in place of a plain list for storing elements. diff --git a/lib/sqlalchemy/orm/properties.py b/lib/sqlalchemy/orm/properties.py index 80443a7f36..a5e6930b28 100644 --- a/lib/sqlalchemy/orm/properties.py +++ b/lib/sqlalchemy/orm/properties.py @@ -444,8 +444,10 @@ class RelationshipProperty(StrategizedProperty): comparator_factory=None, single_parent=False, innerjoin=False, doc=None, + cascade_backrefs=True, load_on_pending=False, - strategy_class=None, _local_remote_pairs=None, query_class=None): + strategy_class=None, _local_remote_pairs=None, + query_class=None): self.uselist = uselist self.argument = argument @@ -460,6 +462,7 @@ class RelationshipProperty(StrategizedProperty): self._user_defined_foreign_keys = foreign_keys self.collection_class = collection_class self.passive_deletes = passive_deletes + self.cascade_backrefs = cascade_backrefs self.passive_updates = passive_updates self.remote_side = remote_side self.enable_typechecks = enable_typechecks diff --git a/lib/sqlalchemy/orm/session.py b/lib/sqlalchemy/orm/session.py index bab98f4faf..d97ef87e76 100644 --- a/lib/sqlalchemy/orm/session.py +++ b/lib/sqlalchemy/orm/session.py @@ -1153,6 +1153,8 @@ class Session(object): This operation cascades to associated instances if the association is mapped with ``cascade="merge"``. + See :ref:`unitofwork_merging` for a detailed discussion of merging. 
+ """ if 'dont_load' in kw: load = not kw['dont_load'] diff --git a/lib/sqlalchemy/orm/unitofwork.py b/lib/sqlalchemy/orm/unitofwork.py index 830ac3c0c8..a9808e6ba6 100644 --- a/lib/sqlalchemy/orm/unitofwork.py +++ b/lib/sqlalchemy/orm/unitofwork.py @@ -33,10 +33,13 @@ class UOWEventHandler(interfaces.AttributeExtension): def append(self, state, item, initiator): # process "save_update" cascade rules for when # an instance is appended to the list of another instance + sess = _state_session(state) if sess: prop = _state_mapper(state).get_property(self.key) - if prop.cascade.save_update and item not in sess: + if prop.cascade.save_update and \ + (prop.cascade_backrefs or self.key == initiator.key) and \ + item not in sess: sess.add(item) return item @@ -55,11 +58,13 @@ class UOWEventHandler(interfaces.AttributeExtension): # is attached to another instance if oldvalue is newvalue: return newvalue + sess = _state_session(state) if sess: prop = _state_mapper(state).get_property(self.key) if newvalue is not None and \ prop.cascade.save_update and \ + (prop.cascade_backrefs or self.key == initiator.key) and \ newvalue not in sess: sess.add(newvalue) if prop.cascade.delete_orphan and \ diff --git a/test/orm/inheritance/test_magazine.py b/test/orm/inheritance/test_magazine.py index 36ac7f9192..125a5629c6 100644 --- a/test/orm/inheritance/test_magazine.py +++ b/test/orm/inheritance/test_magazine.py @@ -187,17 +187,18 @@ def generate_round_trip_test(use_unions=False, use_joins=False): pub = Publication(name='Test') issue = Issue(issue=46,publication=pub) - location = Location(ref='ABC',name='London',issue=issue) page_size = PageSize(name='A4',width=210,height=297) magazine = Magazine(location=location,size=page_size) + page = ClassifiedPage(magazine=magazine,page_no=1) page2 = MagazinePage(magazine=magazine,page_no=2) page3 = ClassifiedPage(magazine=magazine,page_no=3) session.add(pub) + session.flush() print [x for x in session] session.expunge_all() diff --git 
a/test/orm/test_cascade.py b/test/orm/test_cascade.py index 1c935df139..75b9e22ec1 100644 --- a/test/orm/test_cascade.py +++ b/test/orm/test_cascade.py @@ -4,7 +4,7 @@ from sqlalchemy import Integer, String, ForeignKey, Sequence, \ exc as sa_exc from sqlalchemy.test.schema import Table, Column from sqlalchemy.orm import mapper, relationship, create_session, \ - sessionmaker, class_mapper, backref + sessionmaker, class_mapper, backref, Session from sqlalchemy.orm import attributes, exc as orm_exc from sqlalchemy.test import testing from sqlalchemy.test.testing import eq_ @@ -939,6 +939,67 @@ class M2MCascadeTest(_base.MappedTest): assert b1 not in a1.bs assert b1 in a2.bs +class NoBackrefCascadeTest(_fixtures.FixtureTest): + run_inserts = None + + @classmethod + @testing.resolve_artifact_names + def setup_mappers(cls): + mapper(Address, addresses) + mapper(User, users, properties={ + 'addresses':relationship(Address, backref='user', + cascade_backrefs=False) + }) + + mapper(Dingaling, dingalings, properties={ + 'address' : relationship(Address, backref='dingalings', + cascade_backrefs=False) + }) + + @testing.resolve_artifact_names + def test_o2m(self): + sess = Session() + + u1 = User(name='u1') + sess.add(u1) + + a1 = Address(email_address='a1') + a1.user = u1 + assert a1 not in sess + + sess.commit() + + assert a1 not in sess + + sess.add(a1) + + d1 = Dingaling() + d1.address = a1 + assert d1 in a1.dingalings + assert d1 in sess + + sess.commit() + + @testing.resolve_artifact_names + def test_m2o(self): + sess = Session() + + a1 = Address(email_address='a1') + d1 = Dingaling() + sess.add(d1) + + a1.dingalings.append(d1) + assert a1 not in sess + + a2 = Address(email_address='a2') + sess.add(a2) + + u1 = User(name='u1') + u1.addresses.append(a2) + assert u1 in sess + + sess.commit() + class UnsavedOrphansTest(_base.MappedTest): """Pending entities that are orphans"""
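
The condition this patch adds to ``UOWEventHandler.append()`` and ``set()`` can be read as a small truth table. The sketch below is a standalone toy model of that condition, not SQLAlchemy code: ``RelationshipFlags`` and ``should_cascade_add`` are hypothetical names introduced only for illustration, and the keys ``'addresses'`` and ``'user'`` are borrowed from the ``User``/``Address`` examples in the docs above.

```python
from dataclasses import dataclass


@dataclass
class RelationshipFlags:
    """Hypothetical stand-in for the two relationship() settings the check reads."""
    save_update: bool = True
    cascade_backrefs: bool = True


def should_cascade_add(flags, handler_key, initiator_key, already_in_session):
    """Toy mirror of the patched condition:

        prop.cascade.save_update and
        (prop.cascade_backrefs or self.key == initiator.key) and
        item not in sess
    """
    # The event is a "forward" event when it started on this very attribute,
    # and a backref event when it started on the reverse attribute.
    forward_event = handler_key == initiator_key
    return bool(
        flags.save_update
        and (flags.cascade_backrefs or forward_event)
        and not already_in_session
    )


default_flags = RelationshipFlags()
no_backref_cascade = RelationshipFlags(cascade_backrefs=False)

# a1.user = u1: the event starts on 'user' and reaches 'addresses' via the backref.
backref_default = should_cascade_add(default_flags, "addresses", "user", False)
backref_disabled = should_cascade_add(no_backref_cascade, "addresses", "user", False)

# u1.addresses.append(a1): the event starts on 'addresses' itself.
forward_disabled = should_cascade_add(no_backref_cascade, "addresses", "addresses", False)
```

Under these assumptions, only the backref-initiated case changes when ``cascade_backrefs=False``; a forward append to the collection still cascades, which matches what ``NoBackrefCascadeTest`` asserts.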