Behavioral Changes - ORM
========================
+.. _change_4519:
+
+Accessing an uninitialized collection attribute on a transient object no longer mutates __dict__
+-------------------------------------------------------------------------------------------------
+
+It has always been SQLAlchemy's behavior that accessing mapped attributes on a
+newly created object returns an implicitly generated value, rather than raising
+``AttributeError``, such as ``None`` for scalar attributes or ``[]`` for a
+list-holding relationship::
+
+ >>> u1 = User()
+ >>> u1.name
+ None
+ >>> u1.addresses
+ []
+
+The rationale for the above behavior was originally to make ORM objects easier
+to work with. Since an ORM object represents an empty row when first created
+without any state, it is intuitive that its un-accessed attributes would
+resolve to ``None`` (or SQL NULL) for scalars and to empty collections for
+relationships. In particular, it makes possible an extremely common pattern
+of being able to mutate the new collection without manually creating and
+assigning an empty collection first::
+
+ >>> u1 = User()
+ >>> u1.addresses.append(Address()) # no need to assign u1.addresses = []
+
+Up until version 1.0 of SQLAlchemy, the behavior of this initialization system
+for both scalar attributes as well as collections would be that the ``None`` or
+empty collection would be *populated* into the object's state, e.g.
+``__dict__``. This meant that the following two operations were equivalent::
+
+ >>> u1 = User()
+ >>> u1.name = None # explicit assignment
+
+ >>> u2 = User()
+ >>> u2.name # implicit assignment just by accessing it
+ None
+
+Where above, both ``u1`` and ``u2`` would have the value ``None`` populated
+in the value of the ``name`` attribute. Since this is a SQL NULL, the ORM
+would skip including these values within an INSERT so that SQL-level defaults
+take place, if any, else the value defaults to NULL on the database side.
+
+In version 1.0 as part of :ref:`migration_3061`, this behavior was refined so
+that the ``None`` value was no longer populated into ``__dict__``, only
+returned. Besides removing the mutating side effect of a getter operation,
+this change also made it possible to set columns that did have server defaults
+to the value NULL by actually assigning ``None``, which was now distinguished
+from just reading it.
+
+The change however did not accommodate for collections, where returning an
+empty collection that is not assigned meant that this mutable collection would
+be different each time and also would not be able to correctly accommodate for
+mutating operations (e.g. append, add, etc.) called upon it. While the
+behavior continued to generally not get in anyone's way, an edge case was
+eventually identified in :ticket:`4519` where this empty collection could be
+harmful, which is when the object is merged into a session::
+
+ >>> u1 = User(id=1) # create an empty User to merge with id=1 in the database
+ >>> merged1 = session.merge(u1) # value of merged1.addresses is unchanged from that of the DB
+
+ >>> u2 = User(id=2) # create an empty User to merge with id=2 in the database
+ >>> u2.addresses
+ []
+ >>> merged2 = session.merge(u2) # value of merged2.addresses has been emptied in the DB
+
+Above, the ``.addresses`` collection on ``merged1`` will contain all the
+``Address()`` objects that were already in the database. ``merged2`` will
+not; because it has an empty list implicitly assigned, the ``.addresses``
+collection will be erased. This is an example of where this mutating side
+effect can actually mutate the database itself.
+
+While it was considered that perhaps the attribute system should begin using
+strict "plain Python" behavior, raising ``AttributeError`` in all cases for
+non-existent attributes on non-persistent objects and requiring that all
+collections be explicitly assigned, such a change would likely be too extreme
+for the vast number of applications that have relied upon this behavior for
+many years, leading to a complex rollout / backwards compatibility problem as
+well as the likelihood that workarounds to restore the old behavior would
+become prevalent, thus rendering the whole change ineffective in any case.
+
+The change then is to keep the default producing behavior, but to finally make
+the non-mutating behavior of scalars a reality for collections as well, via the
+addition of additional mechanics in the collection system. When accessing the
+empty attribute, the new collection is created and associated with the state,
+however is not added to ``__dict__`` until it is actually mutated::
+
+ >>> u1 = User()
+ >>> l1 = u1.addresses # new list is created, associated with the state
+ >>> assert u1.addresses is l1 # you get the same list each time you access it
+ >>> assert "addresses" not in u1.__dict__ # but it won't go into __dict__ until it's mutated
+ >>> from sqlalchemy import inspect
+ >>> inspect(u1).attrs.addresses.history
+ History(added=None, unchanged=None, deleted=None)
+
+When the list is changed, then it becomes part of the tracked changes to
+be persisted to the database::
+
+ >>> l1.append(Address())
+ >>> assert "addresses" in u1.__dict__
+ >>> inspect(u1).attrs.addresses.history
+ History(added=[<__main__.Address object at 0x7f49b725eda0>], unchanged=[], deleted=[])
+
+This change is expected to have *nearly* no impact on existing applications
+in any way, except that it has been observed that some applications may be
+relying upon the implicit assignment of this collection, such as to assert that
+the object contains certain values based on its ``__dict__``::
+
+ >>> u1 = User()
+ >>> u1.addresses
+ []
+ # this will now fail, would pass before
+ >>> assert {k: v for k, v in u1.__dict__.items() if not k.startswith("_")} == {"addresses": []}
+
+or to ensure that the collection won't require a lazy load to proceed, the
+(admittedly awkward) code below will now also fail::
+
+ >>> u1 = User()
+ >>> u1.addresses
+ []
+ >>> s.add(u1)
+ >>> s.flush()
+ >>> s.close()
+ >>> u1.addresses # <-- will fail, .addresses is not loaded and object is detached
+
+Applications that rely upon the implicit mutating behavior of collections will
+need to be changed so that they assign the desired collection explicitly::
+
+ >>> u1.addresses = []
+
+:ticket:`4519`
+
.. _change_4662:
The "New instance conflicts with existing identity" error is now a warning
--- /dev/null
+.. change::
+ :tags: bug, orm
+ :tickets: 4519
+
+ Accessing a collection-oriented attribute on a newly created object no
+ longer mutates ``__dict__``, but still returns an empty collection as has
+ always been the case. This allows collection-oriented attributes to work
+ consistently in comparison to scalar attributes which return ``None``, but
+ also don't mutate ``__dict__``. In order to accommodate for the collection
+ being mutated, the same empty collection is returned each time once
+ initially created, and when it is mutated (e.g. an item appended, added,
+ etc.) it is then moved into ``__dict__``. This removes the last of
+ mutating side-effects on read-only attribute access within the ORM.
+
+ .. seealso::
+
+ :ref:`change_4519`
\ No newline at end of file
"""
raise NotImplementedError()
- def initialize(self, state, dict_):
- """Initialize the given state's attribute with an empty value."""
+ def _default_value(self, state, dict_):
+ """Produce an empty value for an uninitialized scalar attribute."""
value = None
for fn in self.dispatch.init_scalar:
if not passive & INIT_OK:
return NO_VALUE
else:
- # Return a new, empty value
- return self.initialize(state, dict_)
+ return self._default_value(state, dict_)
def append(self, state, dict_, value, initiator, passive=PASSIVE_OFF):
self.set(state, dict_, value, initiator, passive=passive)
# del is a no-op if collection not present.
del dict_[self.key]
- def initialize(self, state, dict_):
- """Initialize this attribute with an empty collection."""
+ def _default_value(self, state, dict_):
+ """Produce an empty collection for an un-initialized attribute"""
- _, user_data = self._initialize_collection(state)
- dict_[self.key] = user_data
+ if self.key in state._empty_collections:
+ return state._empty_collections[self.key]
+
+ adapter, user_data = self._initialize_collection(state)
+ adapter._set_empty(user_data)
return user_data
def _initialize_collection(self, state):
old = self.get(state, dict_, passive=PASSIVE_ONLY_PERSISTENT)
if old is PASSIVE_NO_RESULT:
- old = self.initialize(state, dict_)
+ old = self._default_value(state, dict_)
elif old is orig_iterable:
# ignore re-assignment of the current collection, as happens
# implicitly with in-place operators (foo.collection |= other)
@classmethod
def from_collection(cls, attribute, state, current):
original = state.committed_state.get(attribute.key, _NO_HISTORY)
-
if current is NO_VALUE:
return cls((), (), ())
"""Initialize a collection attribute and return the collection adapter."""
attr = state.manager[key].impl
- user_data = attr.initialize(state, dict_)
- return attr.get_collection(state, dict_, user_data)
+ user_data = attr._default_value(state, dict_)
+ adapter = attr.get_collection(state, dict_, user_data)
+ adapter._reset_empty()
+ return adapter
def set_committed_value(instance, key, value):
"owner_state",
"_converter",
"invalidated",
+ "empty",
)
def __init__(self, attr, owner_state, data):
data._sa_adapter = self
self._converter = data._sa_converter
self.invalidated = False
+ self.empty = False
def _warn_invalidated(self):
util.warn("This collection has been invalidated.")
self._data()._sa_appender(item, _sa_initiator=initiator)
+ def _set_empty(self, user_data):
+ assert (
+ not self.empty
+ ), "This collection adapter is alreay in the 'empty' state"
+ self.empty = True
+ self.owner_state._empty_collections[self._key] = user_data
+
+ def _reset_empty(self):
+ assert (
+ self.empty
+ ), "This collection adapter is not in the 'empty' state"
+ self.empty = False
+ self.owner_state.dict[
+ self._key
+ ] = self.owner_state._empty_collections.pop(self._key)
+
+ def _refuse_empty(self):
+ raise sa_exc.InvalidRequestError(
+ "This is a special 'empty' collection which cannot accommodate "
+ "internal mutation operations"
+ )
+
def append_without_event(self, item):
"""Add or restore an entity to the collection, firing no events."""
+
+ if self.empty:
+ self._refuse_empty()
self._data()._sa_appender(item, _sa_initiator=False)
def append_multiple_without_event(self, items):
"""Add or restore an entity to the collection, firing no events."""
+ if self.empty:
+ self._refuse_empty()
appender = self._data()._sa_appender
for item in items:
appender(item, _sa_initiator=False)
def remove_without_event(self, item):
"""Remove an entity from the collection, firing no events."""
+ if self.empty:
+ self._refuse_empty()
self._data()._sa_remover(item, _sa_initiator=False)
def clear_with_event(self, initiator=None):
"""Empty the collection, firing a mutation event for each entity."""
+ if self.empty:
+ self._refuse_empty()
remover = self._data()._sa_remover
for item in list(self):
remover(item, _sa_initiator=initiator)
def clear_without_event(self):
"""Empty the collection, firing no events."""
+ if self.empty:
+ self._refuse_empty()
remover = self._data()._sa_remover
for item in list(self):
remover(item, _sa_initiator=False)
if initiator is not False:
if self.invalidated:
self._warn_invalidated()
+
+ if self.empty:
+ self._reset_empty()
+
return self.attr.fire_append_event(
self.owner_state, self.owner_state.dict, item, initiator
)
if initiator is not False:
if self.invalidated:
self._warn_invalidated()
+
+ if self.empty:
+ self._reset_empty()
+
self.attr.fire_remove_event(
self.owner_state, self.owner_state.dict, item, initiator
)
"owner_cls": self.owner_state.class_,
"data": self.data,
"invalidated": self.invalidated,
+ "empty": self.empty,
}
def __setstate__(self, d):
d["data"]._sa_adapter = self
self.invalidated = d["invalidated"]
self.attr = getattr(d["owner_cls"], self._key).impl
+ self.empty = d.get("empty", False)
def bulk_replace(values, existing_adapter, new_adapter, initiator=None):
.. seealso::
+ :meth:`.AttributeEvents.init_collection` - collection version
+ of this event
+
:class:`.AttributeEvents` - background on listener options such
as propagation to subclasses.
:class:`.AttributeEvents` - background on listener options such
as propagation to subclasses.
+ :meth:`.AttributeEvents.init_scalar` - "scalar" version of this
+ event.
+
"""
def dispose_collection(self, target, collection, collection_adapter):
return
if self.uselist:
- instances = source_state.get_impl(self.key).get(
- source_state, source_dict
- )
- if hasattr(instances, "_sa_adapter"):
- # convert collections to adapters to get a true iterator
- instances = instances._sa_adapter
+ impl = source_state.get_impl(self.key)
+ instances_iterable = impl.get_collection(source_state, source_dict)
+
+ # if this is a CollectionAttributeImpl, then empty should
+ # be False, otherwise "self.key in source_dict" should not be
+ # True
+ assert not instances_iterable.empty if impl.collection else True
if load:
# for a full merge, pre-load the destination collection,
dest_state.get_impl(self.key).get(dest_state, dest_dict)
dest_list = []
- for current in instances:
+ for current in instances_iterable:
current_state = attributes.instance_state(current)
current_dict = attributes.instance_dict(current)
_recursive[(current_state, self)] = True
def _pending_mutations(self):
return {}
+ @util.memoized_property
+ def _empty_collections(self):
+ return {}
+
@util.memoized_property
def mapper(self):
"""Return the :class:`.Mapper` used for this mapped object."""
):
def invoke_no_load(state, dict_, row):
if self.uselist:
- state.manager.get_impl(self.key).initialize(state, dict_)
+ attributes.init_state_collection(state, dict_, self.key)
else:
dict_[self.key] = None
class Foo(object):
pass
+ class Bar(object):
+ pass
+
instrumentation.register_class(Foo)
+ instrumentation.register_class(Bar)
attributes.register_attribute(Foo, "bar", useobject=True, uselist=True)
event.listen(Foo.bar, "set", canary)
f1 = Foo()
eq_(f1.bar, [])
+
+ assert "bar" not in f1.__dict__
+
+ adapter = Foo.bar.impl.get_collection(
+ attributes.instance_state(f1), attributes.instance_dict(f1)
+ )
+ assert adapter.empty
+
# reversal of approach in #3061
eq_(canary.mock_calls, [])
+ f1.bar.append(Bar())
+ assert "bar" in f1.__dict__
+ assert not adapter.empty
+
class EventPropagateTest(fixtures.TestBase):
# tests that were expanded as of #4695
from sqlalchemy.testing import assert_raises_message
from sqlalchemy.testing import eq_
from sqlalchemy.testing import fixtures
+from sqlalchemy.testing import is_false
+from sqlalchemy.testing import is_true
from sqlalchemy.testing import ne_
from sqlalchemy.testing.schema import Column
from sqlalchemy.testing.schema import Table
adapter.remove_with_event(e1)
assert_eq()
+ self._test_empty_init(typecallable, creator=creator)
+
+ def _test_empty_init(self, typecallable, creator=None):
+ if creator is None:
+ creator = self.entity_maker
+
+ class Foo(object):
+ pass
+
+ instrumentation.register_class(Foo)
+ d = attributes.register_attribute(
+ Foo,
+ "attr",
+ uselist=True,
+ typecallable=typecallable,
+ useobject=True,
+ )
+
+ obj = Foo()
+ e1 = creator()
+ e2 = creator()
+ implicit_collection = obj.attr
+ is_true("attr" not in obj.__dict__)
+ adapter = collections.collection_adapter(implicit_collection)
+ is_true(adapter.empty)
+ assert_raises_message(
+ sa_exc.InvalidRequestError,
+ "This is a special 'empty'",
+ adapter.append_without_event,
+ e1,
+ )
+
+ adapter.append_with_event(e1)
+ is_false(adapter.empty)
+ is_true("attr" in obj.__dict__)
+ adapter.append_without_event(e2)
+ eq_(set(adapter), {e1, e2})
+
def _test_list(self, typecallable, creator=None):
if creator is None:
creator = self.entity_maker
eq_(insp.attrs.addresses.loaded_value, NO_VALUE)
# regular accessor sets it
eq_(insp.attrs.addresses.value, [])
- # now the None is there
- eq_(insp.attrs.addresses.loaded_value, [])
+ # stays as NO_VALUE, this is #4519
+ eq_(insp.attrs.addresses.loaded_value, NO_VALUE)
def test_instance_state_collection_attr_hist(self):
User = self.classes.User
eq_(hist.unchanged, None)
u1.addresses
hist = insp.attrs.addresses.history
- eq_(hist.unchanged, [])
+ # stays, this is #4519
+ eq_(hist.unchanged, None)
def test_instance_state_scalar_attr_hist(self):
User = self.classes.User
eq_(hist.unchanged, ())
u1.addresses
hist = insp.attrs.addresses.load_history()
- eq_(hist.unchanged, [])
+ # stays, this is #4519
+ eq_(hist.unchanged, ())
def test_instance_state_scalar_attr_hist_load(self):
User = self.classes.User
),
)
+ def test_transient_non_mutated_collection(self):
+ User, Address, addresses, users = (
+ self.classes.User,
+ self.classes.Address,
+ self.tables.addresses,
+ self.tables.users,
+ )
+
+ mapper(
+ User,
+ users,
+ properties={"addresses": relationship(Address, backref="user")},
+ )
+ mapper(Address, addresses)
+
+ s = Session()
+ u = User(
+ id=7,
+ name="fred",
+ addresses=[Address(id=1, email_address="jack@bean.com")],
+ )
+ s.add(u)
+ s.commit()
+ s.close()
+
+ u = User(id=7, name="fred")
+ # access address collection to get implicit blank collection
+ eq_(u.addresses, [])
+ u2 = s.merge(u)
+
+ # collection wasn't emptied
+ eq_(u2.addresses, [Address()])
+
def test_transient_to_pending_collection_pk_none(self):
User, Address, addresses, users = (
self.classes.User,