SQLAlchemy 0.3 Documentation
- More On Mapper Properties
- Overriding Column Names
- Overriding Properties
- Custom List Classes
- Custom Join Conditions
- Lazy/Eager Joins Multiple Times to One Table
- Deferred Column Loading
- Working with Large Collections
- Relation Options
- Controlling Ordering
- Limiting Rows Combined with Eager Loads
- Mapping a Class with Table Inheritance
- Mapping a Class against Multiple Tables
- Mapping a Class against Arbitrary Selects
- Multiple Mappers for One Class
- Self Referential Mappers
- Statement and Result-Set ORM Queries
- Mapper Keyword Arguments
- Extending Mapper
This section details all the options available to Mappers, as well as advanced patterns.
To start, heres the tables we will work with again:
from sqlalchemy import * metadata = MetaData() # a table to store users users_table = Table('users', metadata, Column('user_id', Integer, primary_key = True), Column('user_name', String(40)), Column('password', String(80)) ) # a table that stores mailing addresses associated with a specific user addresses_table = Table('addresses', metadata, Column('address_id', Integer, primary_key = True), Column('user_id', Integer, ForeignKey("users.user_id")), Column('street', String(100)), Column('city', String(80)), Column('state', String(2)), Column('zip', String(10)) ) # a table that stores keywords keywords_table = Table('keywords', metadata, Column('keyword_id', Integer, primary_key = True), Column('name', VARCHAR(50)) ) # a table that associates keywords with users userkeywords_table = Table('userkeywords', metadata, Column('user_id', INT, ForeignKey("users")), Column('keyword_id', INT, ForeignKey("keywords")) )
More On Mapper Properties
Overriding Column Names
When mappers are constructed, by default the column names in the Table metadata are used as the names of attributes on the mapped class. This can be customzed within the properties by stating the key/column combinations explicitly:
user_mapper = mapper(User, users_table, properties={ 'id' : users_table.c.user_id, 'name' : users_table.c.user_name, })
In the situation when column names overlap in a mapper against multiple tables, columns may be referenced together with a list:
# join users and addresses usersaddresses = sql.join(users_table, addresses_table, users_table.c.user_id == addresses_table.c.user_id) m = mapper(User, usersaddresses, properties = { 'id' : [users_table.c.user_id, addresses_table.c.user_id], } )
Overriding Properties
A common request is the ability to create custom class properties that override the behavior of setting/getting an attribute. Currently, the easiest way to do this in SQLAlchemy is how it would be done in any Python program; define your attribute with a different name, such as "_attribute", and use a property to get/set its value. The mapper just needs to be told of the special name:
class MyClass(object): def _set_email(self, email): self._email = email def _get_email(self): return self._email email = property(_get_email, _set_email) mapper(MyClass, mytable, properties = { # map the '_email' attribute to the "email" column # on the table '_email': mytable.c.email })
It is also possible to route the the select_by and get_by functions on Query using the new property name, by establishing a synonym:
mapper(MyClass, mytable, properties = { # map the '_email' attribute to the "email" column # on the table '_email': mytable.c.email, # make a synonym 'email' 'email' : synonym('_email') }) # now you can select_by(email) result = session.query(MyClass).select_by(email='john@smith.com')
Synonym can be established with the flag "proxy=True", to create a class-level proxy to the actual property. This has the effect of creating a fully functional synonym on class instances:
mapper(MyClass, mytable, properties = { '_email': mytable.c.email 'email' : synonym('_email', proxy=True) }) x = MyClass() x.email = 'john@doe.com' >>> x._email 'john@doe.com'
Custom List Classes
Feature Status: Alpha API
A one-to-many or many-to-many relationship results in a list-holding element being attached to all instances of a class. The actual list is an "instrumented" list, which transparently maintains a relationship to a plain Python list. The implementation of the underlying plain list can be changed to be any object that implements a list-style append and __iter__ method. A common need is for a list-based relationship to actually be a dictionary. This can be achieved by subclassing dict to have list-like behavior.
In this example, a class MyClass is defined, which is associated with a parent object MyParent. The collection of MyClass objects on each MyParent object will be a dictionary, storing each MyClass instance keyed to its name attribute.
# a class to be stored in the list class MyClass(object): def __init__(self, name): self.name = name # create a dictionary that will act like a list, and store # instances of MyClass class MyDict(dict): def append(self, item): self[item.name] = item def __iter__(self): return self.values() # parent class class MyParent(object): pass # mappers, constructed normally mapper(MyClass, myclass_table) mapper(MyParent, myparent_table, properties={ 'myclasses' : relation(MyClass, collection_class=MyDict) }) # elements on 'myclasses' can be accessed via string keyname myparent = MyParent() myparent.myclasses.append(MyClass('this is myclass')) myclass = myparent.myclasses['this is myclass']
Note: SQLAlchemy 0.4 has an overhauled and much improved implementation for custom list classes, with some slight API changes.
back to section topCustom Join Conditions
When creating relations on a mapper, most examples so far have illustrated the mapper and relationship joining up based on the foreign keys of the tables they represent. in fact, this "automatic" inspection can be completely circumvented using the primaryjoin and secondaryjoin arguments to relation, as in this example which creates a User object which has a relationship to all of its Addresses which are in Boston:
class User(object): pass class Address(object): pass mapper(Address, addresses_table) mapper(User, users_table, properties={ 'boston_addresses' : relation(Address, primaryjoin= and_(users_table.c.user_id==Address.c.user_id, Addresses.c.city=='Boston')) })
Many to many relationships can be customized by one or both of primaryjoin and secondaryjoin, shown below with just the default many-to-many relationship explicitly set:
class User(object): pass class Keyword(object): pass mapper(Keyword, keywords_table) mapper(User, users_table, properties={ 'keywords':relation(Keyword, secondary=userkeywords_table, primaryjoin=users_table.c.user_id==userkeywords_table.c.user_id, secondaryjoin=userkeywords_table.c.keyword_id==keywords_table.c.keyword_id ) })
Lazy/Eager Joins Multiple Times to One Table
The previous example leads in to the idea of joining against the same table multiple times. Below is a User object that has lists of its Boston and New York addresses:
mapper(User, users_table, properties={ 'boston_addresses' : relation(Address, primaryjoin= and_(users_table.c.user_id==Address.c.user_id, Addresses.c.city=='Boston')), 'newyork_addresses' : relation(Address, primaryjoin= and_(users_table.c.user_id==Address.c.user_id, Addresses.c.city=='New York')), })
Both lazy and eager loading support multiple joins equally well.
back to section topDeferred Column Loading
This feature allows particular columns of a table to not be loaded by default, instead being loaded later on when first referenced. It is essentailly "column-level lazy loading". This feature is useful when one wants to avoid loading a large text or binary field into memory when its not needed. Individual columns can be lazy loaded by themselves or placed into groups that lazy-load together.
book_excerpts = Table('books', db, Column('book_id', Integer, primary_key=True), Column('title', String(200), nullable=False), Column('summary', String(2000)), Column('excerpt', String), Column('photo', Binary) ) class Book(object): pass # define a mapper that will load each of 'excerpt' and 'photo' in # separate, individual-row SELECT statements when each attribute # is first referenced on the individual object instance mapper(Book, book_excerpts, properties = { 'excerpt' : deferred(book_excerpts.c.excerpt), 'photo' : deferred(book_excerpts.c.photo) })
Deferred columns can be placed into groups so that they load together:
book_excerpts = Table('books', db, Column('book_id', Integer, primary_key=True), Column('title', String(200), nullable=False), Column('summary', String(2000)), Column('excerpt', String), Column('photo1', Binary), Column('photo2', Binary), Column('photo3', Binary) ) class Book(object): pass # define a mapper with a 'photos' deferred group. when one photo is referenced, # all three photos will be loaded in one SELECT statement. The 'excerpt' will # be loaded separately when it is first referenced. mapper(Book, book_excerpts, properties = { 'excerpt' : deferred(book_excerpts.c.excerpt), 'photo1' : deferred(book_excerpts.c.photo1, group='photos'), 'photo2' : deferred(book_excerpts.c.photo2, group='photos'), 'photo3' : deferred(book_excerpts.c.photo3, group='photos') })
You can defer or undefer columns at the Query level with the options method:
query = session.query(Book) query.options(defer('summary')).all() query.options(undefer('excerpt')).all()
Working with Large Collections
SQLAlchemy relations are generally simplistic; the lazy loader loads in the full list of child objects when accessed, and the eager load builds a query that loads the full list of child objects. Additionally, when you are deleting a parent object, SQLAlchemy ensures that it has loaded the full list of child objects so that it can mark them as deleted as well (or to update their parent foreign key to NULL). It does not issue an en-masse "delete from table where parent_id=?" type of statement in such a scenario. This is because the child objects themselves may also have further dependencies, and additionally may also exist in the current session in which case SA needs to know their identity so that their state can be properly updated.
So there are several techniques that can be used individually or combined together to address these issues, in the context of a large collection where you normally would not want to load the full list of relationships:
Use
lazy=Noneto disable child object loading (i.e. noload)mapper(MyClass, table, properties=relation{ 'children':relation(MyOtherClass, lazy=None) })
To load child objects, just use a query. Of particular convenience is that
Queryis a generative object, so you can return it as is, allowing additional criterion to be added as needed:class Organization(object): def __init__(self, name): self.name = name member_query = property(lambda self: object_session(self).query(Member).with_parent(self)) myorg = sess.query(Organization).get(5) # get all members members = myorg.member_query.list() # query a subset of members using LIMIT/OFFSET members = myorg.member_query[5:10]
Use
passive_deletes=Trueto disable child object loading on a DELETE operation, in conjunction with "ON DELETE (CASCADE|SET NULL)" on your database to automatically cascade deletes to child objects. Note that "ON DELETE" is not supported on SQLite, and requiresInnoDBtables when using MySQL:mytable = Table('mytable', meta, Column('id', Integer, primary_key=True), ) myothertable = Table('myothertable', meta, Column('id', Integer, primary_key=True), Column('parent_id', Integer), ForeignKeyConstraint(['parent_id'],['mytable.id'], ondelete="CASCADE"), ) mmapper(MyOtherClass, myothertable) mapper(MyClass, mytable, properties={ 'children':relation(MyOtherClass, passive_deletes=True) })
As an alternative to using "ON DELETE CASCADE", for very simple scenarios you can create a simple
MapperExtensionthat will issue a DELETE for child objects before the parent object is deleted:class DeleteMemberExt(MapperExtension): def before_delete(self, mapper, connection, instance): connection.execute(member_table.delete(member_table.c.org_id==instance.org_id)) mapper(Organization, org_table, extension=DeleteMemberExt(), properties = { 'members' : relation(Member, lazy=None, passive_deletes=True, cascade="all, delete-orphan") })
Note that this approach is not nearly as efficient or general-purpose as "ON DELETE CASCADE", since the database itself can cascade the operation along any number of tables.
The latest distribution includes an example examples/collection/large_collection.py which illustrates most of these techniques.
Relation Options
Options which can be sent to the relation() function. For arguments to mapper(), see Mapper Keyword Arguments.
- association - Deprecated; as of version 0.3.0 the association keyword is synonomous with applying the "all, delete-orphan" cascade to a "one-to-many" relationship. SA can now automatically reconcile a "delete" and "insert" operation of two objects with the same "identity" in a flush() operation into a single "update" statement, which is the pattern that "association" used to indicate. See the updated example of association mappings in datamapping_association.
-
backref - indicates the name of a property to be placed on the related mapper's class that will handle this relationship in the other direction, including synchronizing the object attributes on both sides of the relation. Can also point to a
backref()construct for more configurability. See Backreferences. - cascade - a string list of cascade rules which determines how persistence operations should be "cascaded" from parent to child. For a description of cascade rules, see Lifecycle Relations and Cascade rules.
- collection_class - a class or function that returns a new list-holding object. will be used in place of a plain list for storing elements. See Custom List Classes.
-
foreign_keys - a list of columns which are to be used as "foreign key" columns. this parameter should be used in conjunction with explicit
primaryjoinandsecondaryjoin(if needed) arguments, and the columns within theforeign_keyslist should be present within those join conditions. Normally,relation()will inspect the columns within the join conditions to determine which columns are the "foreign key" columns, based on information in theTablemetadata. Use this argument when no ForeignKey's are present in the join condition, or to override the table-defined foreign keys. -
foreignkey - deprecated. use the
foreign_keysargument for foreign key specification, orremote_sidefor "directional" logic. -
lazy=True - specifies how the related items should be loaded. a value of True indicates they should be loaded lazily when the property is first accessed. A value of False indicates they should be loaded by joining against the parent object query, so parent and child are loaded in one round trip (i.e. eagerly). A value of None indicates the related items are not loaded by the mapper in any case; the application will manually insert items into the list in some other way. In all cases, items added or removed to the parent object's collection (or scalar attribute) will cause the appropriate updates and deletes upon flush(), i.e. this option only affects load operations, not save operations.
- order_by - indicates the ordering that should be applied when loading these items. See the section Controlling Ordering for details.
-
passive_deletes=False - Indicates if lazy-loaders should not be executed during the
flush()process, which normally occurs in order to locate all existing child items when a parent item is to be deleted. Setting this flag to True is appropriate whenON DELETE CASCADErules have been set up on the actual tables so that the database may handle cascading deletes automatically. This strategy is useful particularly for handling the deletion of objects that have very large (and/or deep) child-object collections. See the example in Working with Large Collections. -
post_update - this indicates that the relationship should be handled by a second UPDATE statement after an INSERT or before a DELETE. Currently, it also will issue an UPDATE after the instance was UPDATEd as well, although this technically should be improved. This flag is used to handle saving bi-directional dependencies between two individual rows (i.e. each row references the other), where it would otherwise be impossible to INSERT or DELETE both rows fully since one row exists before the other. Use this flag when a particular mapping arrangement will incur two rows that are dependent on each other, such as a table that has a one-to-many relationship to a set of child rows, and also has a column that references a single child row within that list (i.e. both tables contain a foreign key to each other). If a
flush()operation returns an error that a "cyclical dependency" was detected, this is a cue that you might want to usepost_updateto "break" the cycle. - primaryjoin - a ClauseElement that will be used as the primary join of this child object against the parent object, or in a many-to-many relationship the join of the primary object to the association table. By default, this value is computed based on the foreign key relationships of the parent and child tables (or association table).
-
private=False - deprecated. setting
private=Trueis the equivalent of settingcascade="all, delete-orphan", and indicates the lifecycle of child objects should be contained within that of the parent. See the example in Lifecycle Relations. - remote_side - used for self-referential relationships, indicates the column or list of columns that form the "remote side" of the relationship. See the examples in Self Referential Mappers.
-
secondary - for a many-to-many relationship, specifies the intermediary table. The
secondarykeyword argument should generally only be used for a table that is not otherwise expressed in any class mapping. In particular, using the Association Object Pattern is generally mutually exclusive against using thesecondarykeyword argument. - secondaryjoin - a ClauseElement that will be used as the join of an association table to the child object. By default, this value is computed based on the foreign key relationships of the association and child tables.
-
uselist=(True|False) - a boolean that indicates if this property should be loaded as a list or a scalar. In most cases, this value is determined automatically by
relation(), based on the type and direction of the relationship - one to many forms a list, many to one forms a scalar, many to many is a list. If a scalar is desired where normally a list would be present, such as a bi-directional one-to-one relationship, set uselist to False. - viewonly=False - when set to True, the relation is used only for loading objects within the relationship, and has no effect on the unit-of-work flush process. Relations with viewonly can specify any kind of join conditions to provide additional views of related objects onto a parent object. Note that the functionality of a viewonly relationship has its limits - complicated join conditions may not compile into eager or lazy loaders properly. If this is the case, use an alternative method, such as those described in Working with Large Collections, Statement and Result-Set ORM Queries, or Mapping a Class against Arbitrary Selects.
Controlling Ordering
By default, mappers will attempt to ORDER BY the "oid" column of a table, or the primary key column, when selecting rows. This can be modified in several ways.
The "order_by" parameter can be sent to a mapper, overriding the per-engine ordering if any. A value of None means that the mapper should not use any ordering. A non-None value, which can be a column, an asc or desc clause, or an array of either one, indicates the ORDER BY clause that should be added to all select queries:
# disable all ordering mapper = mapper(User, users_table, order_by=None) # order by a column mapper = mapper(User, users_table, order_by=users_tableusers_table.c.user_id) # order by multiple items mapper = mapper(User, users_table, order_by=[users_table.c.user_id, desc(users_table.c.user_name)])
"order_by" can also be specified with queries, overriding all other per-engine/per-mapper orderings:
# order by a column l = query.filter(users_table.c.user_name=='fred').order_by(users_table.c.user_id).all() # order by multiple criterion l = query.filter(users_table.c.user_name=='fred').order_by([users_table.c.user_id, desc(users_table.c.user_name)])
The "order_by" property can also be specified on a relation() which will control the ordering of the collection:
mapper(Address, addresses_table) # order address objects by address id mapper(User, users_table, properties = { 'addresses' : relation(Address, order_by=addresses_table.c.address_id) })
back to section top
Limiting Rows Combined with Eager Loads
As indicated in the docs on Query, you can limit rows using limit() and offset(). However, things get tricky when dealing with eager relationships, since a straight LIMIT of rows will interfere with the eagerly-loaded rows. So here is what SQLAlchemy will do when you use limit or offset with an eager relationship:
class User(object): pass class Address(object): pass mapper(User, users_table, properties={ 'addresses' : relation(mapper(Address, addresses_table), lazy=False) }) r = session.query(User).filter(User.c.user_name.like('F%')).limit(20).offset(10).all() {opensql}SELECT users.user_id AS users_user_id, users.user_name AS users_user_name, users.password AS users_password, addresses.address_id AS addresses_address_id, addresses.user_id AS addresses_user_id, addresses.street AS addresses_street, addresses.city AS addresses_city, addresses.state AS addresses_state, addresses.zip AS addresses_zip FROM (SELECT users.user_id FROM users WHERE users.user_name LIKE %(users_user_name)s ORDER BY users.oid LIMIT 20 OFFSET 10) AS rowcount, users LEFT OUTER JOIN addresses ON users.user_id = addresses.user_id WHERE rowcount.user_id = users.user_id ORDER BY users.oid, addresses.oid {'users_user_name': 'F%'}
The main WHERE clause as well as the limiting clauses are coerced into a subquery; this subquery represents the desired result of objects. A containing query, which handles the eager relationships, is joined against the subquery to produce the result. This is something to keep in mind as it's a complex query which may be problematic on databases with poor support for LIMIT, such as Oracle which does not support it natively.
back to section topMapping a Class with Table Inheritance
Inheritance in databases comes in three forms: single table inheritance, where several types of classes are stored in one table, concrete table inheritance, where each type of class is stored in its own table, and joined table inheritance, where the parent/child classes are stored in their own tables that are joined together in a select.
There is also the ability to load "polymorphically", which is that a single query loads objects of multiple types at once.
SQLAlchemy supports all three kinds of inheritance. Additionally, true "polymorphic" loading is supported in a straightfoward way for single table inheritance, and has some more manually-configured features that can make it happen for concrete and multiple table inheritance.
Working examples of polymorphic inheritance come with the distribution in the directory examples/polymorphic.
Here are the classes we will use to represent an inheritance relationship:
class Employee(object): def __init__(self, name): self.name = name def __repr__(self): return self.__class__.__name__ + " " + self.name class Manager(Employee): def __init__(self, name, manager_data): self.name = name self.manager_data = manager_data def __repr__(self): return self.__class__.__name__ + " " + self.name + " " + self.manager_data class Engineer(Employee): def __init__(self, name, engineer_info): self.name = name self.engineer_info = engineer_info def __repr__(self): return self.__class__.__name__ + " " + self.name + " " + self.engineer_info
Each class supports a common name attribute, while the Manager class has its own attribute manager_data and the Engineer class has its own attribute engineer_info.
Single Table Inheritance
This will support polymorphic loading via the Employee mapper.
employees_table = Table('employees', metadata, Column('employee_id', Integer, primary_key=True), Column('name', String(50)), Column('manager_data', String(50)), Column('engineer_info', String(50)), Column('type', String(20)) ) employee_mapper = mapper(Employee, employees_table, polymorphic_on=employees_table.c.type) manager_mapper = mapper(Manager, inherits=employee_mapper, polymorphic_identity='manager') engineer_mapper = mapper(Engineer, inherits=employee_mapper, polymorphic_identity='engineer')
Concrete Table Inheritance
Without polymorphic loading, you just define a separate mapper for each class.
managers_table = Table('managers', metadata, Column('employee_id', Integer, primary_key=True), Column('name', String(50)), Column('manager_data', String(50)), ) engineers_table = Table('engineers', metadata, Column('employee_id', Integer, primary_key=True), Column('name', String(50)), Column('engineer_info', String(50)), ) manager_mapper = mapper(Manager, managers_table) engineer_mapper = mapper(Engineer, engineers_table)
With polymorphic loading, the SQL query to do the actual polymorphic load must be constructed, usually as a UNION. There is a helper function to create these UNIONS called polymorphic_union.
pjoin = polymorphic_union({ 'manager':managers_table, 'engineer':engineers_table }, 'type', 'pjoin') employee_mapper = mapper(Employee, pjoin, polymorphic_on=pjoin.c.type) manager_mapper = mapper(Manager, managers_table, inherits=employee_mapper, concrete=True, polymorphic_identity='manager') engineer_mapper = mapper(Engineer, engineers_table, inherits=employee_mapper, concrete=True, polymorphic_identity='engineer')
Joined Table Inheritance
Like concrete table inheritance, this can be done non-polymorphically, or with a little more complexity, polymorphically:
employees = Table('employees', metadata, Column('person_id', Integer, primary_key=True), Column('name', String(50)), Column('type', String(30))) engineers = Table('engineers', metadata, Column('person_id', Integer, ForeignKey('employees.person_id'), primary_key=True), Column('engineer_info', String(50)), ) managers = Table('managers', metadata, Column('person_id', Integer, ForeignKey('employees.person_id'), primary_key=True), Column('manager_data', String(50)), ) person_mapper = mapper(Employee, employees) mapper(Engineer, engineers, inherits=person_mapper) mapper(Manager, managers, inherits=person_mapper)
Polymorphically, joined-table inheritance is easier than concrete, as a simple outer join can usually work:
person_join = people.outerjoin(engineers).outerjoin(managers) person_mapper = mapper(Person, people, select_table=person_join,polymorphic_on=people.c.type, polymorphic_identity='person') mapper(Engineer, engineers, inherits=person_mapper, polymorphic_identity='engineer') mapper(Manager, managers, inherits=person_mapper, polymorphic_identity='manager')
In SQLAlchemy 0.4, the above mapper setup can load polymorphically without the join as well, by issuing distinct queries for each subclasses' table.
The join condition in a joined table inheritance structure can be specified explicitly, using inherit_condition:
AddressUser.mapper = mapper( AddressUser, addresses_table, inherits=User.mapper, inherit_condition=users_table.c.user_id==addresses_table.c.user_id )
Mapping a Class against Multiple Tables
Mappers can be constructed against arbitrary relational units (called Selectables) as well as plain Tables. For example, The join keyword from the SQL package creates a neat selectable unit comprised of multiple tables, complete with its own composite primary key, which can be passed in to a mapper as the table.
# a class class AddressUser(object): pass # define a Join j = join(users_table, addresses_table) # map to it - the identity of an AddressUser object will be # based on (user_id, address_id) since those are the primary keys involved m = mapper(AddressUser, j, properties={ 'user_id':[users_table.c.user_id, addresses_table.c.user_id] })
A second example:
# many-to-many join on an association table j = join(users_table, userkeywords, users_table.c.user_id==userkeywords.c.user_id).join(keywords, userkeywords.c.keyword_id==keywords.c.keyword_id) # a class class KeywordUser(object): pass # map to it - the identity of a KeywordUser object will be # (user_id, keyword_id) since those are the primary keys involved m = mapper(KeywordUser, j, properties={ 'user_id':[users_table.c.user_id, userkeywords.c.user_id], 'keyword_id':[userkeywords.c.keyword_id, keywords.c.keyword_id] })
In both examples above, "composite" columns were added as properties to the mappers; these are aggregations of multiple columns into one mapper property, which instructs the mapper to keep both of those columns set at the same value.
back to section topMapping a Class against Arbitrary Selects
Similar to mapping against a join, a plain select() object can be used with a mapper as well. Below, an example select which contains two aggregate functions and a group_by is mapped to a class:
s = select([customers, func.count(orders).label('order_count'), func.<
