Describing Databases with MetaData
==================================
-The core of SQLAlchemy's query and object mapping operations are supported by **database metadata**, which is comprised of Python objects that describe tables and other schema-level objects. These objects can be created by explicitly naming the various components and their properties, using the Table, Column, ForeignKey, Index, and Sequence objects imported from ``sqlalchemy.schema``. There is also support for **reflection** of some entities, which means you only specify the *name* of the entities and they are recreated from the database automatically.
+The core of SQLAlchemy's query and object mapping operations is supported by *database metadata*, which is composed of Python objects that describe tables and other schema-level objects. These objects are at the core of three major types of operations: issuing CREATE and DROP statements (known as *DDL*), constructing SQL queries, and expressing information about structures that already exist within the database.
+
+Database metadata can be expressed by explicitly naming the various components and their properties, using constructs such as ``Table``, ``Column``, ``ForeignKey`` and ``Sequence``, all of which are imported from the ``sqlalchemy.schema`` package. It can also be generated by SQLAlchemy using a process called *reflection*, which means you start with a single object such as ``Table``, assign it a name, and then instruct SQLAlchemy to load all the additional information related to that name from a particular engine source.
+
+A key feature of SQLAlchemy's database metadata constructs is that they are designed to be used in a *declarative* style which closely resembles that of real DDL. They are therefore most intuitive to those who have some background in creating real schema generation scripts.
A collection of metadata entities is stored in an object aptly named ``MetaData``::
metadata = MetaData()
-To represent a Table, use the ``Table`` class::
+``MetaData`` is a container object that keeps together many different features of a database (or multiple databases) being described.
+
+To represent a table, use the ``Table`` class. Its two primary arguments are the table name, then the ``MetaData`` object which it will be associated with. The remaining positional arguments are mostly ``Column`` objects describing each column::
- users = Table('users', metadata,
+ user = Table('user', metadata,
Column('user_id', Integer, primary_key = True),
Column('user_name', String(16), nullable = False),
Column('email_address', String(60), key='email'),
Column('password', String(20), nullable = False)
)
-
- user_prefs = Table('user_prefs', metadata,
+
+Above, a table called ``user`` is described, which contains four columns. The primary key of the table consists of the ``user_id`` column. Multiple columns may be assigned the ``primary_key=True`` flag which denotes a multi-column primary key, known as a *composite* primary key.
+
+Note also that each column describes its datatype using objects corresponding to genericized types, such as ``Integer`` and ``String``. SQLAlchemy features dozens of types of varying levels of specificity as well as the ability to create custom types. Documentation on the type system can be found at :ref:`types`.
+
+.. _metadata_foreignkeys:
+
+Defining Foreign Keys
+---------------------
+
+A *foreign key* in SQL is a table-level construct that constrains one or more columns in that table to only allow values that are present in a different set of columns, typically but not always located on a different table. We call the columns which are constrained the *foreign key* columns and the columns which they are constrained towards the *referenced* columns. The referenced columns almost always define the primary key for their owning table, though there are exceptions to this. The foreign key is the "joint" that connects together pairs of rows which have a relationship with each other, and SQLAlchemy assigns very deep importance to this concept in virtually every area of its operation.
+
+In SQLAlchemy as well as in DDL, foreign key constraints can be defined as additional attributes within the table clause, or for single-column foreign keys they may optionally be specified within the definition of a single column. The single column foreign key is more common, and at the column level is specified by constructing a ``ForeignKey`` object as an argument to a ``Column`` object::
+
+ user_preference = Table('user_preference', metadata,
Column('pref_id', Integer, primary_key=True),
- Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
+ Column('user_id', Integer, ForeignKey("user.user_id"), nullable=False),
Column('pref_name', String(40), nullable=False),
Column('pref_value', String(100))
)
-The specific datatypes for each Column, such as Integer, String, etc. are described in `types`, and exist within the module ``sqlalchemy.types`` as well as the global ``sqlalchemy`` namespace.
+Above, we define a new table ``user_preference`` for which each row must contain a value in the ``user_id`` column that also exists in the ``user`` table's ``user_id`` column.
-.. _metadata_foreignkeys:
+The argument to ``ForeignKey`` is most commonly a string of the form *<tablename>.<columnname>*, or for a table in a remote schema or "owner" of the form *<schemaname>.<tablename>.<columnname>*. It may also be an actual ``Column`` object, which as we'll see later is accessed from an existing ``Table`` object via its ``c`` collection::
-Defining Foreign Keys
----------------------
+ ForeignKey(user.c.user_id)
-Foreign keys are most easily specified by the ``ForeignKey`` object within a ``Column`` object. For a composite foreign key, i.e. a foreign key that contains multiple columns referencing multiple columns to a composite primary key, an explicit syntax is provided which allows the correct table CREATE statements to be generated::
+The advantage to using a string is that the in-Python linkage between ``user`` and ``user_preference`` is resolved only when first needed, so that table objects can be easily spread across multiple modules and defined in any order.
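As a minimal sketch of that ordering freedom, the snippet below declares the referencing table before the table it points to. The trimmed column lists are illustrative only, and the sketch assumes a reasonably recent SQLAlchemy release where ``Column.foreign_keys`` and ``ForeignKey.column`` are available.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, String, Table

metadata = MetaData()

# The referencing table is declared first; 'user.user_id' is only a string
# here, so the ``user`` table does not need to exist yet.
user_preference = Table('user_preference', metadata,
    Column('pref_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey('user.user_id'), nullable=False),
    Column('pref_name', String(40), nullable=False),
)

# The referenced table is declared afterwards.
user = Table('user', metadata,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16), nullable=False),
)

# By the time we ask for it, the string target has been matched to the
# actual Column object on the ``user`` table.
fk = next(iter(user_preference.c.user_id.foreign_keys))
resolved_to_user_id = fk.column is user.c.user_id
```

Passing ``user.c.user_id`` directly would instead require ``user`` to be defined (and imported) before ``user_preference``.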
- # a table with a composite primary key
- invoices = Table('invoices', metadata,
+Foreign keys may also be defined at the table level, using the ``ForeignKeyConstraint`` object. This object can describe a single- or multi-column foreign key. A multi-column foreign key is known as a *composite* foreign key, and almost always references a table that has a composite primary key. Below we define a table ``invoice`` which has a composite primary key::
+
+ invoice = Table('invoice', metadata,
Column('invoice_id', Integer, primary_key=True),
Column('ref_num', Integer, primary_key=True),
Column('description', String(60), nullable=False)
)
-
- # a table with a composite foreign key referencing the parent table
- invoice_items = Table('invoice_items', metadata,
+
+And then a table ``invoice_item`` with a composite foreign key referencing ``invoice``::
+
+ invoice_item = Table('invoice_item', metadata,
Column('item_id', Integer, primary_key=True),
Column('item_name', String(60), nullable=False),
Column('invoice_id', Integer, nullable=False),
Column('ref_num', Integer, nullable=False),
- ForeignKeyConstraint(['invoice_id', 'ref_num'], ['invoices.invoice_id', 'invoices.ref_num'])
+ ForeignKeyConstraint(['invoice_id', 'ref_num'], ['invoice.invoice_id', 'invoice.ref_num'])
)
-Above, the ``invoice_items`` table will have ``ForeignKey`` objects automatically added to the ``invoice_id`` and ``ref_num`` ``Column`` objects as a result of the additional ``ForeignKeyConstraint`` object.
+It's important to note that the ``ForeignKeyConstraint`` is the only way to define a composite foreign key. While we could also have placed individual ``ForeignKey`` objects on both the ``invoice_item.invoice_id`` and ``invoice_item.ref_num`` columns, SQLAlchemy would not be aware that these two values should be paired together - it would be two individual foreign key constraints instead of a single composite foreign key referencing two columns.
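The difference can be observed by counting constraints. The following is a sketch only: the table name ``invoice_item_separate`` is hypothetical, introduced here purely for comparison, and the ``Table.foreign_key_constraints`` accessor is assumed to be available (it is present in recent SQLAlchemy releases).

```python
from sqlalchemy import (Column, ForeignKey, ForeignKeyConstraint, Integer,
                        MetaData, Table)

metadata = MetaData()

invoice = Table('invoice', metadata,
    Column('invoice_id', Integer, primary_key=True),
    Column('ref_num', Integer, primary_key=True),
)

# One composite constraint pairing both columns together.
invoice_item = Table('invoice_item', metadata,
    Column('item_id', Integer, primary_key=True),
    Column('invoice_id', Integer, nullable=False),
    Column('ref_num', Integer, nullable=False),
    ForeignKeyConstraint(['invoice_id', 'ref_num'],
                         ['invoice.invoice_id', 'invoice.ref_num']),
)

# Hypothetical comparison table: two independent single-column constraints.
invoice_item_separate = Table('invoice_item_separate', metadata,
    Column('item_id', Integer, primary_key=True),
    Column('invoice_id', Integer, ForeignKey('invoice.invoice_id'), nullable=False),
    Column('ref_num', Integer, ForeignKey('invoice.ref_num'), nullable=False),
)

composite_count = len(invoice_item.foreign_key_constraints)
separate_count = len(invoice_item_separate.foreign_key_constraints)
```

Here ``composite_count`` is 1 while ``separate_count`` is 2, even though both tables carry two ``ForeignKey`` objects in their ``foreign_keys`` collection.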
Accessing Tables and Columns
----------------------------
-The ``MetaData`` object supports some handy methods, such as getting a list of Tables in the order (or reverse) of their dependency::
+The ``MetaData`` object contains all of the schema constructs we've associated with it. It supports a few methods of accessing these table objects, such as the ``sorted_tables`` accessor which returns a list of each ``Table`` object in order of foreign key dependency (that is, each table is preceded by all tables which it references)::
- >>> for t in metadata.table_iterator(reverse=False):
+ >>> for t in metadata.sorted_tables:
... print(t.name)
- users
- user_prefs
-
-And ``Table`` provides an interface to the table's properties as well as that of its columns::
+ user
+ user_preference
+ invoice
+ invoice_item
+
+In most cases, individual ``Table`` objects have been explicitly declared, and these objects are typically accessed directly as module-level variables in an application. ``Table`` provides an interface to the table's properties as well as that of its columns::
employees = Table('employees', metadata,
Column('employee_id', Integer, primary_key=True),
.. _metadata_binding:
+
+Creating and Dropping Database Tables
+-------------------------------------
+
+Once you've defined some ``Table`` objects, and assuming you're working with a brand new database, one thing you might want to do is issue CREATE statements for those tables and their related constructs. As an aside, it's also quite possible that you *don't* want to do this, if you already have some preferred methodology such as tools included with your database or an existing scripting system. If that's the case, feel free to skip this section; SQLAlchemy has no requirement that it be used to create your tables.
+
+The usual way to issue CREATE is to use ``create_all()`` on the ``MetaData`` object. This method will issue queries that first check for the existence of each individual table, and if not found will issue the CREATE statements:
+
+ .. sourcecode:: python+sql
+
+ engine = create_engine('sqlite:///:memory:')
+
+ metadata = MetaData()
+
+ user = Table('user', metadata,
+ Column('user_id', Integer, primary_key = True),
+ Column('user_name', String(16), nullable = False),
+ Column('email_address', String(60), key='email'),
+ Column('password', String(20), nullable = False)
+ )
+
+ user_prefs = Table('user_prefs', metadata,
+ Column('pref_id', Integer, primary_key=True),
+ Column('user_id', Integer, ForeignKey("user.user_id"), nullable=False),
+ Column('pref_name', String(40), nullable=False),
+ Column('pref_value', String(100))
+ )
+
+ {sql}metadata.create_all(engine)
+ PRAGMA table_info(user){}
+ CREATE TABLE user(
+ user_id INTEGER NOT NULL PRIMARY KEY,
+ user_name VARCHAR(16) NOT NULL,
+ email_address VARCHAR(60),
+ password VARCHAR(20) NOT NULL
+ )
+ PRAGMA table_info(user_prefs){}
+ CREATE TABLE user_prefs(
+ pref_id INTEGER NOT NULL PRIMARY KEY,
+ user_id INTEGER NOT NULL REFERENCES user(user_id),
+ pref_name VARCHAR(40) NOT NULL,
+ pref_value VARCHAR(100)
+ )
+
+``create_all()`` creates foreign key constraints between tables usually inline with the table definition itself, and for this reason it also generates the tables in order of their dependency. There are options to change this behavior such that ``ALTER TABLE`` is used instead.
+
+Dropping all tables is similarly achieved using the ``drop_all()`` method. This method does the exact opposite of ``create_all()`` - the presence of each table is checked first, and tables are dropped in reverse order of dependency.
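The round trip can be checked with a short sketch. It assumes an in-memory SQLite database and uses ``sqlalchemy.inspect()`` to list table names, which is not part of the text above and requires a reasonably recent SQLAlchemy release; the trimmed column lists are illustrative.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, Table,
                        create_engine, inspect)

engine = create_engine('sqlite:///:memory:')
metadata = MetaData()

Table('user', metadata,
    Column('user_id', Integer, primary_key=True),
)
Table('user_prefs', metadata,
    Column('pref_id', Integer, primary_key=True),
    Column('user_id', Integer, ForeignKey('user.user_id'), nullable=False),
)

metadata.create_all(engine)
# A fresh inspector is requested each time so no cached names are reused.
tables_after_create = sorted(inspect(engine).get_table_names())

# Calling create_all() again is harmless: existing tables are skipped.
metadata.create_all(engine)

metadata.drop_all(engine)
tables_after_drop = inspect(engine).get_table_names()
```

After ``create_all()`` both table names are reported; after ``drop_all()`` the list is empty.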
+
+Creating and dropping individual tables can be done via the ``create()`` and ``drop()`` methods of ``Table``. These methods by default issue the CREATE or DROP regardless of the table being present:
+
+.. sourcecode:: python+sql
+
+ engine = create_engine('sqlite:///:memory:')
+
+ meta = MetaData()
+
+ employees = Table('employees', meta,
+ Column('employee_id', Integer, primary_key=True),
+ Column('employee_name', String(60), nullable=False, key='name'),
+ Column('employee_dept', Integer, ForeignKey("departments.department_id"))
+ )
+ {sql}employees.create(engine)
+ CREATE TABLE employees(
+ employee_id INTEGER NOT NULL PRIMARY KEY,
+ employee_name VARCHAR(60) NOT NULL,
+ employee_dept INTEGER REFERENCES departments(department_id)
+ )
+ {}
+
+``drop()`` method:
+
+.. sourcecode:: python+sql
+
+ {sql}employees.drop(engine)
+ DROP TABLE employees
+ {}
+
+To enable the "check first for the table existing" logic, add the ``checkfirst=True`` argument to ``create()`` or ``drop()``::
+
+ employees.create(engine, checkfirst=True)
+ employees.drop(engine, checkfirst=True)
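A small sketch of the difference, assuming an in-memory SQLite database. It also assumes that the "table already exists" error from the driver surfaces as ``sqlalchemy.exc.OperationalError``, which is the case for SQLite; other backends may raise a different exception class.

```python
from sqlalchemy import Column, Integer, MetaData, Table, create_engine
from sqlalchemy.exc import OperationalError

engine = create_engine('sqlite:///:memory:')
meta = MetaData()

employees = Table('employees', meta,
    Column('employee_id', Integer, primary_key=True),
)

employees.create(engine)

# With checkfirst=True the existing table is detected and nothing is issued.
employees.create(engine, checkfirst=True)

# Without it, CREATE TABLE is emitted unconditionally and the database
# rejects the duplicate.
try:
    employees.create(engine)
    second_create_failed = False
except OperationalError:
    second_create_failed = True

# Dropping twice is likewise safe only when the check is enabled.
employees.drop(engine, checkfirst=True)
employees.drop(engine, checkfirst=True)
```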
+
+
Binding MetaData to an Engine or Connection
--------------------------------------------
-A ``MetaData`` object can be associated with an ``Engine`` or an individual ``Connection``; this process is called **binding**. The term used to describe "an engine or a connection" is often referred to as a **connectable**. Binding allows the ``MetaData`` and the elements which it contains to perform operations against the database directly, using the connection resources to which it's bound. Common operations which are made more convenient through binding include being able to generate SQL constructs which know how to execute themselves, creating ``Table`` objects which query the database for their column and constraint information, and issuing CREATE or DROP statements.
+Notice in the previous section the creator/dropper methods accept an argument for the database engine in use. When a schema construct is combined with an ``Engine`` object, or an individual ``Connection`` object, we call this the *bind*. The bind above lasts for the duration of the operation, as it is an argument to the method. The ``MetaData`` object also has the option to be persistently bound to a single ``Engine`` or ``Connection`` (which we sometimes call a *connectable*), such that subsequent operations with the ``MetaData`` will automatically use that resource::
-To bind ``MetaData`` to an ``Engine``, use the ``bind`` attribute::
-
- engine = create_engine('sqlite://', **kwargs)
+ engine = create_engine('sqlite://')
# create MetaData
meta = MetaData()
# bind to an engine
meta.bind = engine
-Once this is done, the ``MetaData`` and its contained ``Table`` objects can access the database directly::
+We can now call methods like ``create_all()`` without needing to pass the ``Engine``::
- meta.create_all() # issue CREATE statements for all tables
+ meta.create_all()
+
+The MetaData's bind is used for anything that requires an active connection, such as loading the definition of a table from the database automatically (called *reflection*)::
# describe a table called 'users', query the database for its columns
users_table = Table('users', meta, autoload=True)
-
+
+As well as for executing SQL constructs that are derived from that MetaData's table objects::
+
# generate a SELECT statement and execute
result = users_table.select().execute()
-Note that the feature of binding engines is **completely optional**. All of the operations which take advantage of "bound" ``MetaData`` also can be given an ``Engine`` or ``Connection`` explicitly with which to perform the operation. The equivalent "non-bound" of the above would be::
+Binding the MetaData to the Engine is a **completely optional** feature. The above operations can be achieved without the persistent bind using parameters::
- meta.create_all(engine) # issue CREATE statements for all tables
-
- # describe a table called 'users', query the database for its columns
+ # describe a table called 'users', query the database for its columns
users_table = Table('users', meta, autoload=True, autoload_with=engine)
-
+
# generate a SELECT statement and execute
result = engine.execute(users_table.select())
+Should you use bind? It's probably best to start without it. If you find yourself constantly needing to specify the same ``Engine`` object throughout the entire application, consider binding as a convenience feature which is applicable to applications that don't have multiple engines in use and don't have the need to reference connections explicitly. It should also be noted that an application which is focused on using the SQLAlchemy ORM will not be dealing explicitly with ``Engine`` or ``Connection`` objects very much in any case.
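For comparison, here is a sketch of the fully explicit style, where every operation names its engine or connection. The ``users`` columns and the sample row are illustrative assumptions, and the connection-as-context-manager usage assumes a reasonably recent SQLAlchemy release.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, create_engine

engine = create_engine('sqlite://')
meta = MetaData()

users_table = Table('users', meta,
    Column('user_id', Integer, primary_key=True),
    Column('user_name', String(16), nullable=False),
)

# The engine is passed explicitly; nothing is bound to ``meta``.
meta.create_all(engine)

# engine.begin() yields a connection and commits when the block exits.
with engine.begin() as conn:
    conn.execute(users_table.insert(), {'user_name': 'ed'})

# Statements are executed on an explicit connection rather than via
# a bound construct.
with engine.connect() as conn:
    rows = conn.execute(users_table.select()).fetchall()

names = [row[1] for row in rows]
```

This style is slightly more verbose, but it makes the connection in use visible at each call site, which helps once more than one engine is involved.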
+
Reflecting Tables
-----------------
+A ``Table`` object can be instructed to load information about itself from the corresponding database schema object already existing within the database. This process is called *reflection*. Most simply you need only specify the name, a ``MetaData`` object, and the ``autoload=True`` flag::
-A ``Table`` object can be created without specifying any of its contained attributes, using the argument ``autoload=True`` in conjunction with the table's name and possibly its schema (if not the databases "default" schema). (You can also specify a list or set of column names to autoload as the kwarg include_columns, if you only want to load a subset of the columns in the actual database.) This will issue the appropriate queries to the database in order to locate all properties of the table required for SQLAlchemy to use it effectively, including its column names and datatypes, foreign and primary key constraints, and in some cases its default-value generating attributes. To use ``autoload=True``, the table's ``MetaData`` object need be bound to an ``Engine`` or ``Connection``, or alternatively the ``autoload_with=<some connectable>`` argument can be passed. Below we illustrate autoloading a table and then iterating through the names of its columns::
-
- >>> messages = Table('messages', meta, autoload=True)
+ >>> messages = Table('messages', meta, autoload=True, autoload_with=engine)
>>> [c.name for c in messages.columns]
['message_id', 'message_name', 'date']
-Note that if a reflected table has a foreign key referencing another table, the related ``Table`` object will be automatically created within the ``MetaData`` object if it does not exist already. Below, suppose table ``shopping_cart_items`` references a table ``shopping_carts``. After reflecting, the ``shopping carts`` table is present:
+The above operation will use the given engine to query the database for information about the ``messages`` table, and will then generate ``Column``, ``ForeignKey``, and other objects corresponding to this information as though the ``Table`` object were hand-constructed in Python.
-.. sourcecode:: pycon+sql
+When tables are reflected, if a given table references another one via foreign key, a second ``Table`` object representing the referenced table is created within the ``MetaData`` object. Below, assume the table ``shopping_cart_items`` references a table named ``shopping_carts``. Reflecting the ``shopping_cart_items`` table has the effect that the ``shopping_carts`` table will also be loaded::
>>> shopping_cart_items = Table('shopping_cart_items', meta, autoload=True)
>>> 'shopping_carts' in meta.tables
True
-To get direct access to 'shopping_carts', simply instantiate it via the ``Table`` constructor. ``Table`` uses a special constructor that will return the already created ``Table`` instance if it's already present:
-
-.. sourcecode:: python+sql
+The ``MetaData`` has an interesting "singleton-like" behavior such that if you request both tables individually, ``MetaData`` will ensure that each table name is only used once - the ``Table`` constructor actually returns to you the already-existing ``Table`` object if one exists. Below, we can access the already generated ``shopping_carts`` table just by naming it::
shopping_carts = Table('shopping_carts', meta)
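The reflection behavior described in this section can be tried end to end with a self-contained sketch. It assumes an in-memory SQLite database, invents a minimal column layout for the two shopping cart tables, and passes only ``autoload_with``, which recent SQLAlchemy releases treat as a request to reflect.

```python
from sqlalchemy import (Column, ForeignKey, Integer, MetaData, Table,
                        create_engine)

engine = create_engine('sqlite:///:memory:')

# Set up a database to reflect from; this column layout is hypothetical.
setup = MetaData()
Table('shopping_carts', setup,
    Column('cart_id', Integer, primary_key=True),
)
Table('shopping_cart_items', setup,
    Column('item_id', Integer, primary_key=True),
    Column('cart_id', Integer, ForeignKey('shopping_carts.cart_id')),
)
setup.create_all(engine)

# Reflect into a separate, empty MetaData, naming only one table.
meta = MetaData()
shopping_cart_items = Table('shopping_cart_items', meta, autoload_with=engine)

column_names = [c.name for c in shopping_cart_items.columns]

# The referenced table was loaded as a side effect of following the key.
referenced_loaded = 'shopping_carts' in meta.tables

# Naming the table again hands back the object that is already registered.
shopping_carts = Table('shopping_carts', meta)
same_object = shopping_carts is meta.tables['shopping_carts']
```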
Overriding Reflected Columns
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
Individual columns can be overridden with explicit values when reflecting tables; this is handy for specifying custom datatypes, constraints such as primary keys that may not be configured within the database, etc.::
>>> mytable = Table('mytable', meta,
mysql_engine='InnoDB'
)
-Creating and Dropping Database Tables
-======================================
-
-Creating and dropping individual tables can be done via the ``create()`` and ``drop()`` methods of ``Table``; these methods take an optional ``bind`` parameter which references an ``Engine`` or a ``Connection``. If not supplied, the ``Engine`` bound to the ``MetaData`` will be used, else an error is raised:
-
-.. sourcecode:: python+sql
-
- meta = MetaData()
- meta.bind = 'sqlite:///:memory:'
-
- employees = Table('employees', meta,
- Column('employee_id', Integer, primary_key=True),
- Column('employee_name', String(60), nullable=False, key='name'),
- Column('employee_dept', Integer, ForeignKey("departments.department_id"))
- )
- {sql}employees.create()
- CREATE TABLE employees(
- employee_id SERIAL NOT NULL PRIMARY KEY,
- employee_name VARCHAR(60) NOT NULL,
- employee_dept INTEGER REFERENCES departments(department_id)
- )
- {}
-
-``drop()`` method:
-
-.. sourcecode:: python+sql
-
- {sql}employees.drop(bind=e)
- DROP TABLE employees
- {}
-
-The ``create()`` and ``drop()`` methods also support an optional keyword argument ``checkfirst`` which will issue the database's appropriate pragma statements to check if the table exists before creating or dropping::
-
- employees.create(bind=e, checkfirst=True)
- employees.drop(checkfirst=False)
-
-Entire groups of Tables can be created and dropped directly from the ``MetaData`` object with ``create_all()`` and ``drop_all()``. These methods always check for the existence of each table before creating or dropping. Each method takes an optional ``bind`` keyword argument which can reference an ``Engine`` or a ``Connection``. If no engine is specified, the underlying bound ``Engine``, if any, is used:
-
-.. sourcecode:: python+sql
-
- engine = create_engine('sqlite:///:memory:')
-
- metadata = MetaData()
-
- users = Table('users', metadata,
- Column('user_id', Integer, primary_key = True),
- Column('user_name', String(16), nullable = False),
- Column('email_address', String(60), key='email'),
- Column('password', String(20), nullable = False)
- )
-
- user_prefs = Table('user_prefs', metadata,
- Column('pref_id', Integer, primary_key=True),
- Column('user_id', Integer, ForeignKey("users.user_id"), nullable=False),
- Column('pref_name', String(40), nullable=False),
- Column('pref_value', String(100))
- )
-
- {sql}metadata.create_all(bind=engine)
- PRAGMA table_info(users){}
- CREATE TABLE users(
- user_id INTEGER NOT NULL PRIMARY KEY,
- user_name VARCHAR(16) NOT NULL,
- email_address VARCHAR(60),
- password VARCHAR(20) NOT NULL
- )
- PRAGMA table_info(user_prefs){}
- CREATE TABLE user_prefs(
- pref_id INTEGER NOT NULL PRIMARY KEY,
- user_id INTEGER NOT NULL REFERENCES users(user_id),
- pref_name VARCHAR(40) NOT NULL,
- pref_value VARCHAR(100)
- )
Column Insert/Update Defaults
==============================