the ``save-update`` cascade. For more details see the section
:ref:`unitofwork_cascades`.
+.. _unitofwork_merging:
+
Merging
-------
When given an instance, it follows these steps:
- * It examines the primary key of the instance. If it's present, it attempts
- to load an instance with that primary key (or pulls from the local
- identity map).
- * If there's no primary key on the given instance, or the given primary key
- does not exist in the database, a new instance is created.
- * The state of the given instance is then copied onto the located/newly
- created instance.
- * The operation is cascaded to associated child items along the ``merge``
- cascade. Note that all changes present on the given instance, including
- changes to collections, are merged.
- * The new instance is returned.
+* It examines the primary key of the instance. If it's present, it attempts
+ to load an instance with that primary key (or pulls from the local
+ identity map).
+* If there's no primary key on the given instance, or the given primary key
+ does not exist in the database, a new instance is created.
+* The state of the given instance is then copied onto the located/newly
+ created instance.
+* The operation is cascaded to associated child items along the ``merge``
+ cascade. Note that all changes present on the given instance, including
+ changes to collections, are merged.
+* The new instance is returned.
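The steps above can be pictured with a small, self-contained toy model. This is only a sketch under stated assumptions, not SQLAlchemy's implementation: ``ToyMergeSession`` is a hypothetical class, instances are plain dicts, the "database" is a dict keyed by primary key, and the ``merge`` cascade to child items is left out.

```python
class ToyMergeSession:
    """Hypothetical stand-in for a session; not part of SQLAlchemy."""

    def __init__(self, database):
        self.database = database   # primary key -> dict of column values
        self.identity_map = {}     # primary key -> instance held by this session
        self.new = []              # instances created by merge()

    def merge(self, given):
        pk = given.get("id")
        target = None
        if pk is not None:
            # Look in the identity map first, then "load" from the database.
            target = self.identity_map.get(pk)
            if target is None and pk in self.database:
                target = dict(self.database[pk])
                self.identity_map[pk] = target
        if target is None:
            # No primary key, or no matching row: create a new instance.
            target = {}
            self.new.append(target)
        # Copy the state of the given instance onto the located/new instance.
        target.update(given)
        # Return the session's instance; the given one is left outside.
        return target


session = ToyMergeSession({1: {"id": 1, "name": "ed"}})
given = {"id": 1, "name": "edward"}
merged = session.merge(given)
```

In this sketch ``merged`` is the session's own copy carrying the merged state, while ``given`` remains a separate object, mirroring the point that the given instance is not itself placed in the session.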
With :func:`~sqlalchemy.orm.session.Session.merge`, the given instance is not
placed within the session, and can be associated with a different session or
origins or current session associations and placing that state within a
session. Here's two examples:
- * An application which reads an object structure from a file and wishes to
- save it to the database might parse the file, build up the structure, and
- then use :func:`~sqlalchemy.orm.session.Session.merge` to save it to the
- database, ensuring that the data within the file is used to formulate the
- primary key of each element of the structure. Later, when the file has
- changed, the same process can be re-run, producing a slightly different
- object structure, which can then be ``merged`` in again, and the
- :class:`~sqlalchemy.orm.session.Session` will automatically update the
- database to reflect those changes.
- * A web application stores mapped entities within an HTTP session object.
- When each request starts up, the serialized data can be merged into the
- session, so that the original entity may be safely shared among requests
- and threads.
+* An application which reads an object structure from a file and wishes to
+ save it to the database might parse the file, build up the
+ structure, and then use
+ :func:`~sqlalchemy.orm.session.Session.merge` to save it
+ to the database, ensuring that the data within the file is
+ used to formulate the primary key of each element of the
+ structure. Later, when the file has changed, the same
+ process can be re-run, producing a slightly different
+ object structure, which can then be ``merged`` in again,
+ and the :class:`~sqlalchemy.orm.session.Session` will
+ automatically update the database to reflect those
+ changes.
+* A web application stores mapped entities within an HTTP session object.
+ When each request starts up, the serialized data can be
+ merged into the session, so that the original entity may
+ be safely shared among requests and threads.
:func:`~sqlalchemy.orm.session.Session.merge` is frequently used by
applications which implement their own second level caches. This refers to an
course possible that newer information in the database will not be present on
the merged object, since no load is issued.
+Merge Tips
+~~~~~~~~~~
+
+:meth:`~.Session.merge` is an extremely useful method for many purposes. However,
+it deals with the intricate border between objects that are transient/detached and
+those that are persistent, as well as the automated transference of state.
+The wide variety of scenarios that can present themselves here often require a
+more careful approach to the state of objects. Common problems with merge usually involve
+some unexpected state regarding the object being passed to :meth:`~.Session.merge`.
+
+Let's use the canonical example of the ``User`` and ``Address`` objects::
+
+    class User(Base):
+        __tablename__ = 'user'
+
+        id = Column(Integer, primary_key=True)
+        name = Column(String(50), nullable=False)
+        addresses = relationship("Address", backref="user")
+
+    class Address(Base):
+        __tablename__ = 'address'
+
+        id = Column(Integer, primary_key=True)
+        email_address = Column(String(50), nullable=False)
+        user_id = Column(Integer, ForeignKey('user.id'), nullable=False)
+
+Assume a ``User`` object with one ``Address``, already persistent::
+
+ >>> u1 = User(name='ed', addresses=[Address(email_address='ed@ed.com')])
+ >>> session.add(u1)
+ >>> session.commit()
+
+We now create ``a1``, an object outside the session, which we'd like
+to merge on top of the existing ``Address``::
+
+ >>> existing_a1 = u1.addresses[0]
+ >>> a1 = Address(id=existing_a1.id)
+
+A surprise would occur if we said this::
+
+ >>> a1.user = u1
+ >>> a1 = session.merge(a1)
+ >>> session.commit()
+ sqlalchemy.orm.exc.FlushError: New instance <Address at 0x1298f50>
+ with identity key (<class '__main__.Address'>, (1,)) conflicts with
+ persistent instance <Address at 0x12a25d0>
+
+Why is that? We weren't careful with our cascades. The assignment
+of ``a1.user`` to a persistent object cascaded to the backref of ``User.addresses``
+and made our ``a1`` object pending, as though we had added it. Now we have
+*two* ``Address`` objects in the session::
+
+ >>> a1 = Address()
+ >>> a1.user = u1
+ >>> a1 in session
+ True
+ >>> existing_a1 in session
+ True
+ >>> a1 is existing_a1
+ False
+
+Above, our ``a1`` is already pending in the session. The
+subsequent :meth:`~.Session.merge` operation essentially
+does nothing. Cascade can be configured via the ``cascade``
+option on :func:`.relationship`, although in this case it
+would mean removing the ``save-update`` cascade from the
+``User.addresses`` relationship - and usually, that behavior
+is extremely convenient. The solution here would usually be to not assign
+``a1.user`` to an object already persistent in the target
+session.
+
+Note that a new :func:`.relationship` option introduced in 0.6.5,
+``cascade_backrefs=False``, will also prevent the ``Address`` from
+being added to the session via the ``a1.user = u1`` assignment.
+
+Further detail on cascade operation is at :ref:`unitofwork_cascades`.
+
+Another example of unexpected state::
+
+ >>> a1 = Address(id=existing_a1.id, user_id=u1.id)
+ >>> assert a1.user is None
+ >>> a1 = session.merge(a1)
+ >>> session.commit()
+ sqlalchemy.exc.IntegrityError: (IntegrityError) address.user_id
+ may not be NULL
+
+Here, we accessed ``a1.user``, which returned its default value
+of ``None``. As a result of this access, ``None`` has been placed in the
+``__dict__`` of our object ``a1``. Normally, this operation creates no change event,
+so the ``user_id`` attribute takes precedence during a
+flush. But when we merge the ``Address`` object into the session, the operation
+is equivalent to::
+
+ >>> existing_a1.id = existing_a1.id
+ >>> existing_a1.user_id = u1.id
+ >>> existing_a1.user = None
+
+Above, both ``user_id`` and ``user`` are assigned to, and change events
+are emitted for both. The ``user`` association
+takes precedence, and ``None`` is applied to ``user_id``, causing a failure.
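The precedence rule just described can be illustrated with a deliberately simplified model. This is a sketch under stated assumptions rather than SQLAlchemy's actual change tracking: ``toy_merge`` and ``toy_flush`` are hypothetical helpers, object state is a plain dict standing in for ``__dict__``, and the only rule modeled is that a changed ``user`` attribute overrides ``user_id`` at flush time.

```python
def toy_merge(source_state, target_state):
    """Copy every key present in source_state onto target_state.

    Returns the set of attribute names that received a "change event".
    Hypothetical helper, not a SQLAlchemy function.
    """
    changed = set()
    for key, value in source_state.items():
        target_state[key] = value
        changed.add(key)
    return changed


def toy_flush(state, changed):
    """Apply the rule that a changed relationship overrides the foreign key."""
    if "user" in changed:
        user = state["user"]
        state["user_id"] = None if user is None else user["id"]
    return state


u1 = {"id": 1}

# Case 1: 'user' was accessed on the source, so 'user': None sits in its state.
existing = {"id": 1, "user_id": 1, "user": u1}
changed = toy_merge({"id": 1, "user_id": 1, "user": None}, existing)
with_stray_none = toy_flush(existing, changed)

# Case 2: 'user' was never touched, so only 'id' and 'user_id' are copied.
existing = {"id": 1, "user_id": 1, "user": u1}
changed = toy_merge({"id": 1, "user_id": 1}, existing)
without_stray_none = toy_flush(existing, changed)
```

In the first case the stray ``None`` wins and ``user_id`` ends up ``None``; in the second, ``user_id`` keeps its value because ``user`` never appeared in the merged state.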
+
+Most :meth:`~.Session.merge` issues can be examined by first checking:
+is the object prematurely in the session?
+
+.. sourcecode:: python+sql
+
+ >>> a1 = Address(id=existing_a1.id, user_id=u1.id)
+ >>> assert a1 not in session
+ >>> a1 = session.merge(a1)
+
+Or is there state on the object that we don't want? Examining ``__dict__``
+is a quick way to check::
+
+ >>> a1 = Address(id=existing_a1.id, user_id=u1.id)
+ >>> a1.user
+ >>> a1.__dict__
+ {'_sa_instance_state': <sqlalchemy.orm.state.InstanceState object at 0x1298d10>,
+ 'user_id': 1,
+ 'id': 1,
+ 'user': None}
+ >>> # we don't want user=None merged, remove it
+ >>> del a1.user
+ >>> a1 = session.merge(a1)
+ >>> # success
+ >>> session.commit()
+
Deleting
--------
The ``customer`` relationship specifies only the "save-update" cascade value,
indicating most operations will not be cascaded from a parent ``Order``
instance to a child ``User`` instance except for the
-:func:`~sqlalchemy.orm.session.Session.add` operation. "save-update" cascade
+:func:`~sqlalchemy.orm.session.Session.add` operation. ``save-update`` cascade
indicates that an :func:`~sqlalchemy.orm.session.Session.add` on the parent
will cascade to all child items, and also that items added to a parent which
is already present in a session will also be added to that same session.
The default value for ``cascade`` on :func:`~sqlalchemy.orm.relationship` is
``save-update, merge``.
+``save-update`` cascade also takes place on backrefs by default. This means
+that, given a mapping such as this::
+
+    mapper(Order, order_table, properties={
+        'items' : relationship(Item, backref='order')
+    })
+
+If an ``Order`` is already in the session, and is assigned to the ``order``
+attribute of an ``Item``, the backref appends the ``Item`` to the ``items``
+collection of that ``Order``, resulting in the ``save-update`` cascade taking
+place::
+
+ >>> o1 = Order()
+ >>> session.add(o1)
+ >>> o1 in session
+ True
+
+ >>> i1 = Item()
+ >>> i1.order = o1
+ >>> i1 in o1.items
+ True
+ >>> i1 in session
+ True
+
+This behavior can be disabled as of 0.6.5 using the ``cascade_backrefs`` flag::
+
+    mapper(Order, order_table, properties={
+        'items' : relationship(Item, backref='order',
+                                    cascade_backrefs=False)
+    })
+
+So above, the assignment of ``i1.order = o1`` will append ``i1`` to the ``items``
+collection of ``o1``, but will not add ``i1`` to the session. You can of
+course :func:`~.Session.add` ``i1`` to the session at a later point. This option
+may be helpful for situations where an object needs to be kept out of a
+session until its construction is completed, but still needs to be given
+associations to objects which are already persistent in the target session.
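
The effect of the flag can be pictured with a small pure-Python model. This is a sketch under stated assumptions and does not use SQLAlchemy: ``ToyCascadeSession``, ``Order`` and ``Item`` are hypothetical classes, the backref is emulated with a property setter, and ``cascade_backrefs`` is stored on the ``Order`` instance purely for illustration (in SQLAlchemy it is an option of :func:`~sqlalchemy.orm.relationship`).

```python
class ToyCascadeSession:
    """Hypothetical session that only tracks membership by identity."""

    def __init__(self):
        self._members = []

    def add(self, obj):
        if obj not in self:
            self._members.append(obj)

    def __contains__(self, obj):
        return any(member is obj for member in self._members)


class Order:
    def __init__(self, session=None, cascade_backrefs=True):
        self.items = []
        self.session = session
        self.cascade_backrefs = cascade_backrefs
        if session is not None:
            session.add(self)


class Item:
    def __init__(self):
        self._order = None

    @property
    def order(self):
        return self._order

    @order.setter
    def order(self, order):
        self._order = order
        # Emulated backref: the item always lands in the order's collection.
        order.items.append(self)
        # Emulated save-update cascade along the backref, unless disabled.
        if order.cascade_backrefs and order.session is not None:
            order.session.add(self)


session = ToyCascadeSession()

o1 = Order(session)                          # default: cascade on backrefs
i1 = Item()
i1.order = o1

o2 = Order(session, cascade_backrefs=False)  # cascade on backrefs disabled
i2 = Item()
i2.order = o2
```

Here ``i1`` ends up both in ``o1.items`` and in the session, while ``i2`` is in ``o2.items`` but stays out of the session until it is added explicitly.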
+
+
.. _unitofwork_transaction:
Managing Transactions