| title |
|---|
SQLAlchemy / ORM (Object Relational Mapper) |
SQLAlchemy / ORM (Object Relational Mapper)
- Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
SQLAlchemy ORM (Object Relational Mapper) 提出一種將 user-defined Python class 跟 database table 聯結 (associate) 起來的方法,也就是每個 instance 都對應到個別的 table row。有個 unit of work 的機制可以自動同步 object 與 row 的改變 (transparently),也可以用 class 及宣告的 relationship 來表達 database query。
-
ORM 是基於 SQLAlchemy 的 SQL Expression Language (用 Python constructs 間接執行 SQL statements),一個 application 完全只用 ORM 是沒有問題的,極少數情況下才需要直接使用 SQL Expression Language。
-
生成 mapped class instance 只是第一步,要寫進 database 得透過
Session.add()加進 session 才行,下次 flush 時就會被寫進資料庫。>>> ed_user = User(name='ed', fullname='Ed Jones', password='edspassword') >>> session.add(ed_user) -
加進 session 但還沒寫入資料庫前,這個 instance 處於 pending state (未加入 session 前是 transient state),因為 SQL statement 尚未執行,而且在 database 裡也還沒有對應的 row (還沒有 PK)。
Session會在需要時執行 SQL 寫入資料庫,這個動作稱做 flush,例如在 query 前 - 因為 query 發生在 database 端,異動先進 database (但還沒 commit),查詢的結果才能反應最新的異動。>>> our_user = session.query(User).filter_by(name='ed').first() >>> our_user <User(name='ed', fullname='Ed Jones', password='edspassword')> -
上面 query 的動作會先 flush 再執行 query。這裡傳回的
our_user會等同於ed_user(assert our_user is ed_user),那是因為 session 會維護一個 identity map,在同一個 session 裡,在資料庫裡具有特定 primary key 的 row,只會對應到 session 裡的一個 Python object。
-
-
Declare a Mapping - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
When using the ORM, the configurational process starts by describing the database tables we’ll be dealing with, and then by defining our own classes which will be mapped to those tables. In modern SQLAlchemy, these two tasks are usually PERFORMED TOGETHER, using a system known as DECLARATIVE, which allows us to create classes that INCLUDE DIRECTIVES TO DESCRIBE THE ACTUAL DATABASE TABLE they will be mapped to.
這段話很重要,但要先走過 SQLAlchemy Core 才知道單純的宣告 metadata 是怎麼回事,所以 declarative = mapper(class + metadata),在 class 裡直接提供生成 metadata 需要的素材 (例如
__tablename__、relationship()等),這中間的魔法來自 declarative base class 的 metaclass 機制。 -
Declarative system 需要先定義 mapping (實質上就是 mapped class),裡面主要是 table name、column name/datatype 與 database table 的對應關係。這些 class 要繼承一個 declarative base class (用
sqlalchemy.ext.declarative.declarative_base()產生),內部會維護 class/table 的清單 (catalog),通常一個 app 只會有一個 base class。 -
Mapped class 至少要宣告
__tablename__跟一個Column做為 PK,因為 SQLAlchemy 不對 table name、data type、constraints 等不做任何假設;不過自訂 custom base class 或 mixin 是被鼓勵的 #ril -
以
User對應userstable 為例,有id(PK)、name、fullname與password4 個欄位;另外__repr__()不一定要實作,但印到 logs 時好像很方便?from sqlalchemy.ext.declarative import declarative_base from sqlalchemy import Column, Integer, String Base = declarative_base() class User(Base): __tablename__ = 'users' id = Column(Integer, primary_key=True) name = Column(String) fullname = Column(String) password = Column(String) def __repr__(self): return "<User(name='%s', fullname='%s', password='%s')>" % ( self.name, self.fullname, self.password) -
當 class 建立時,declarative system 透過 metaclass (來自 declarative base class) 將所有的
Column置換成 descriptor (對 attribute access 動手腳)、並自動產生 table metadata -Table(可以從__table__attribute 取得)、透過Mapper建立兩邊的聯結,形成 mapped class,這個過程稱做 instrumentation,可以同步 class 與 database 兩邊的資料;這要先看過 SQL Expression Language 前面才會懂。>>> User.__table__ Table('users', MetaData(bind=None), Column('id', Integer(), table=<users>, primary_key=True, nullable=False), Column('name', String(), table=<users>), Column('fullname', String(), table=<users>), Column('password', String(), table=<users>), schema=None) -
Mapped class 跟其他 class 沒什麼不同,可以宣告其他的 attribute、method 等。
-
雖然建議使用 declarative system,但要使用 SQLAlchemy ORM 也可以走 classical mapping - 任何 Python class 都可以跟透過
mapper()產生Mapper,以建立 class 與Table的聯結。 -
Table只是MetaData的一員,可以從 declarative base class 的metadataattribute 取得,可以把它想成是整個 schema,底下有許多Table。可以想見,透過MetaData.create_all()(搭配Engine做為 database connectivity) 就可以建立資料庫 - 把不存在的 table 建立起來。>>> Base.metadata.create_all(engine) # 這時才會用到 engine 跟資料庫溝通 SELECT ... PRAGMA table_info("users") () CREATE TABLE users ( id INTEGER NOT NULL, name VARCHAR, fullname VARCHAR, password VARCHAR, PRIMARY KEY (id) ) () COMMIT -
Minimal Table Descriptions vs. Full Descriptions 提到 "The length field on String, as well as similar precision/scale fields available on Integer, Numeric, etc. are not referenced by SQLAlchemy other than when creating tables.",也就是說透過 SQLAlchemy 來建立資料結構時這些 constraint 才會作用,跟 validation 無關。
-
要新增一個 instance/row 時,首先要生成一個 mapped class 的 instance:
>>> ed_user = User(name='ed', fullname='Ed Jones', password='edspassword') >>> ed_user.name 'ed' >>> ed_user.password 'edspassword' >>> str(ed_user.id) 'None' -
顯然 instrumentation 還提供了
__init__()constructor,可以接受以 column 為名的 keyword arguments,除非有自訂__init__()。
-
-
instrumentation - Glossary — SQLAlchemy 1.2 Documentation
- Instrumentation refers to the process of augmenting the FUNCTIONALITY and ATTRIBUTE SET of a particular class. Ideally, the behavior of the class should REMAIN CLOSE TO A REGULAR CLASS, except that additional behaviors and features are made available.
- The SQLAlchemy MAPPING PROCESS, among other things, adds DATABASE-ENABLED descriptors to a mapped class which each represent a particular database column or relationship to a related class.
-
SQL Expression Language Tutorial — SQLAlchemy 1.2 Documentation 這裡完全沒提到 session,是 ORM 才有的概念。
-
Creating a Session - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
ORM 跟 database 對話的接口 (handle) 是
Session,一個 session 是一個 workspace,對應一個 database connection (來自 engine 的 connection pool)。首先得用sqlalchemy.orm.sessionmaker()產生一個Session的 factory (可以設定 auto flush、auto commit 等),有趣的是這個 factory 也習慣命名為Session(callable),然後呼叫它以產生一個Sessioninstance ... 這聽起來有點繞口? This custom-madeSessionclass will create newSessionobjects which are bound to our database.from sqlalchemy.orm import sessionmaker Session = sessionmaker(bind=engine) # Session.configure(bind=engine) # 事後綁定 session = Session() -
跟
Engine一樣,雖然Session()生成了Sessioninstance,但還沒有建立連線,開始使用時才會從 engine 的 connection pool 取用一個 connection,直到 commit 或關閉 session 時。 -
Session Lifecycle Patterns 提到
Sessioninstance 的產生時機會因應用類型而異,強調 session 是個 workspace - local to a particular database connection。 -
Session Lifecycle Patterns 強調 "the Session is just a workspace for your objects, local to a particular database connection",workspace 這說法類似於 Git 的 working tree,有助於理解 transient、pending、dirty、new 這些說法 #ril
-
-
State Management — SQLAlchemy 1.2 Documentation object Mapped class 的 instance 在 session 看來有 transient、pending、persistent、deleted 四種狀態 #ril
日前重新看過 SQLAlchemy Core & ORM 的文件,覺得 SQLAlchemy 的 metadata 主要是跟 SQL DDL (CREATE XXX) 有關,跟 SQL DML (SELECT, UPDATE, DELETE 等) 應該沒有關係才對。
這樣的想法透過下面的實驗得到證實;注意 ForeignKey() 跟 relationship() 都是一般的用法:
class User(Base):
__tablename__ = 'user'
id = Column(Integer, primary_key=True)
name = Column(String)
addresses = relationship('Address', order_by='Address.id', back_populates='user')
class Address(Base):
__tablename__ = 'address'
id = Column(Integer, primary_key=True)
email_address = Column(String, nullable=False)
_user_id = Column('user_id', Integer, ForeignKey('user.id'))
user = relationship('User', back_populates='addresses')
def test_orm_without_fk_constraints__mysql(mysql_engine):
conn = mysql_engine.connect()
trans = conn.begin()
try:
# 刻意用單純的 SQL 來建立 table (而非 Metadata.create_all()),確認 DB 裡沒有 FK constraints
conn.execute("CREATE TABLE user (id INT NOT NULL AUTO_INCREMENT, name CHAR(20) NOT NULL, PRIMARY KEY (id));")
conn.execute("CREATE TABLE address (id INT NOT NULL AUTO_INCREMENT, user_id INT NOT NULL, email_address VARCHAR(50), PRIMARY KEY (id));")
conn.execute("INSERT INTO user (id, name) VALUES (1, 'Jeremy');")
conn.execute("INSERT INTO address (id, user_id, email_address) VALUES (1, 1, 'imsardine@gmail.com'), (2, 1, 'jeremykao@kkbox.com')")
trans.commit()
except:
trans.rollback()
raise
# Start a new transaction implicitly
Session = sessionmaker(bind=mysql_engine)
session = Session()
try:
# Test ORM querying 讀取沒問題
jeremy = session.query(User).filter_by(id=1).one()
assert jeremy.name == 'Jeremy'
assert [addr.email_address for addr in jeremy.addresses] == \
['imsardine@gmail.com', 'jeremykao@kkbox.com']
judy = User(name='Judy')
judy.addresses.append(Address(email_address='imjudykao@gmail.com'))
session.add(judy)
session.commit()
# Test ORM writing 寫入也沒問題
judy = session.query(User).filter_by(name='Judy').one()
assert [addr.email_address for addr in judy.addresses] == ['imjudykao@gmail.com']
assert judy.addresses[0].user is judy # backref
finally:
session.close()
過去我們對 "MySQL Cluster 不支援 FK constraints" 這句話可能有誤解,所以一開始就把 _user_id = Column('user_id', Integer, ForeignKey('user.id')) 中的 ForeignKey construct 拿掉,但根據 API 文件的說法,那是 relationship() 推導 table 間關聯方式的根據,結果就變成要手動在 relationship() 裡明確給定 foreign_keys、primaryjoin 等。
Building a Relationship - Object Relational Tutorial — SQLAlchemy 1.2 Documentation
... uses the foreign key relationships between the two tables to determine the nature of this linkage
這裡的 foreign key relationships 指的就是 ForeignKey() 或 ForeignKeyConstraint():
foreign_keys - Relationships API — SQLAlchemy 1.2 Documentation
In normal cases, the foreign_keys parameter is not required.
relationship()will automatically determine which columns in theprimaryjoinconditition are to be considered “foreign key” columns based on thoseColumnobjects that specifyForeignKey, or are otherwise listed as referencing columns in aForeignKeyConstraintconstruct.foreign_keysis only needed when:
很明顯的,我們一開始誤將 ForeignKey() construct 拿掉,因此要自訂 foreign_keys -- 但這通常都不用給,SQLAlchemy 會自己推算,跟 DB backend 支不支援 FK constraints 也沒有關係。
參考資料:
-
Building a Relationship - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
系統內一個 user 可以有多個 email address,也就是
users與addresses兩個 table 間有 one to many association;Mapping 可以這麼做:(建議對照 Define and Create Tables (Core) 一起看,很多設定是 Core 本來就有的,並非專屬於 ORM)>>> from sqlalchemy import ForeignKey >>> from sqlalchemy.orm import relationship >>> class Address(Base): ... __tablename__ = 'addresses' ... id = Column(Integer, primary_key=True) ... email_address = Column(String, nullable=False) ... user_id = Column(Integer, ForeignKey('users.id')) # Core 也是類似的寫法 ... ... user = relationship("User", back_populates="addresses") # 這是 ORM 才有的 ... ... def __repr__(self): ... return "<Address(email_address='%s')>" % self.email_address >>> User.addresses = relationship( # 可以 instrumentation 後才給? ... "Address", order_by=Address.id, back_populates="user") -
user_id = Column(Integer, ForeignKey('users.id'))中ForeignKey的意義是 "values in this column should be CONSTRAINED to be values present in the named remote column",就這個例子而言是指addresses.user_id的值一定要出現在uses.id裡;事實上這寫法在 Define and Create Tables (Core) 就有,只是在描述 schema/metadata,跟 ORM 沒有直接關係。 -
user = relationship("User", back_populates="addresses")就跟 ORM 有關,表示Address透過Address.user與User產生連結 (linkage),relationship()內部會依 foreign key relationships (ForeignKey/ForeignKeyConstraint) 決定連結的類型 -- 按照relationship()API 文件的說法,應該是決定foreign_keys。 -
relationship("User", back_populates="addresses")與relationship("Address", order_by=Address.id, back_populates="user")互相引用對方的 attribute name,形成所謂的 bidirectional relationship,這樣relationship()就會知道這其實是同一個 relationship,只是方向不同而已 (the same relationship as expressed in reverse)。
-
-
sqlalchemy.orm.relationship() - Relationships API — SQLAlchemy 1.2 Documentation #ril
foreign_keys- In normal cases, theforeign_keysparameter is NOT REQUIRED.relationship()will automatically determine which columns in theprimaryjoinconditition are to be considered “foreign key” columns based on thoseColumnobjects that specifyForeignKey, or are otherwise listed as referencing columns in aForeignKeyConstraintconstruct.foreign_keysis only needed when: 提到 "as there are multiple foreign key references present" 及 "TheTablebeing mapped does not actually haveForeignKeyorForeignKeyConstraintconstructs present, often because the table was REFLECTED from a database that does not support foreign key reflection (MySQL MyISAM)." 可見 foreign key relationships 指的正是ForeignKey/ForeignKeyConstraint;就算 backend 不支援 FK,只要手動補上ForeignKey/ForeignKeyConstraint即可?primaryjoin- By default, this value is computed based on the FOREIGN KEY RELATIONSHIPS of the parent and child tables (or association table). 不就是Column(..., ForeignKey())?
-
bidirectional relationship - Glossary — SQLAlchemy 1.2 Documentation #ril
-
Basic Relationship Patterns — SQLAlchemy 1.2 Documentation #ril
-
Working with Related Objects - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
Eager Loading - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
Deleting - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
Building a Many To Many Relationship - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
-
Relationship Configuration — SQLAlchemy 1.2 Documentation #ril
-
Linking Relationships with Backref — SQLAlchemy 1.2 Documentation #ril
-
Configuring how Relationship Joins — SQLAlchemy 1.2 Documentation #ril
- One To Many - Basic Relationship Patterns — SQLAlchemy 1.2 Documentation #ril
- Many To One - Basic Relationship Patterns — SQLAlchemy 1.2 Documentation #ril
- One To One - Basic Relationship Patterns — SQLAlchemy 1.2 Documentation #ril
- Building a Relationship - Object Relational Tutorial — SQLAlchemy 1.2 Documentation 最後 Did you know? 提到 FOREIGN KEY can refer to its own table. This is referred to as a “self-referential” foreign key. 這要如何 mapping?
- Many To Many - Basic Relationship Patterns — SQLAlchemy 1.2 Documentation #ril
- Assocation Object - Basic Relationship Patterns — SQLAlchemy 1.2 Documentation #ril
- 若同時揭露 ID 與 relationship,要存取 ID 是很方便,但更新 relationship 時 ID 會變嗎? 或者更新 ID 時 relationship 會變嗎?
- Defining a relationship without a foreign key constraint? - Google Groups 用
relation()#ril - How do I do a join without a real foreign key constraint? 用
relationship('Parent', primaryjoin='foreign(Child.parent_id) == remote(Parent.id)')搭配 SQLite in-memory database 來測。用create_engine(echo=True)可以觀察CREATE TABLE#ril - orm - sqlalchemy: create relations but without foreign key constraint in db? - Stack Overflow 不要定義
ForeignKeyconstraint,改用relationship()搭配foreign_keys與primaryjoin#ril - Defining Constraints and Indexes — SQLAlchemy 1.2 Documentation 是個 constraint #ril
- Configuring how Relationship Joins — SQLAlchemy 1.2 Documentation 說明
relationship()的用法 #ril - python - SQLAlchemy and joins, we have no foreign keys - Stack Overflow 方法之一是
relationship+primaryjoin#ril
- python - How do I correct this sqlalchemy.exc.NoForeignKeysError? - Stack Overflow 加
Column()裡加上ForeignKey()即可 #ril - 【已解决】Flask的SQLAlchemy出错:NoForeignKeysError Could not determine join condition between parent child tables on relationship – 在路上 (2016-12-13) #ril
- NoForeignKeysError - Core Exceptions — SQLAlchemy 1.2 Documentation Raised when no foreign keys can be located between two selectables during a join.
- Simple Validators - Changing Attribute Behavior — SQLAlchemy 1.3 Documentation #ril
-
A quick way to add a “validation” routine to an attribute is to use the
validates()decorator. An attribute validator can raise an exception, halting the process of mutating the attribute’s value, or can change the given value into something different. (像是 filtering) -
Validators, like all attribute extensions, are only called by normal USERLAND CODE; they are not issued when the ORM is populating the object: 這表示從資料庫載入資料時不會經過這一層,不會有多餘的運算。
from sqlalchemy.orm import validates class EmailAddress(Base): __tablename__ = 'address' id = Column(Integer, primary_key=True) email = Column(String) @validates('email') def validate_email(self, key, address): assert '@' in address return address
-
在生成第一個 mapping class 的 instance 時,觸發了下面的初始化:
# Initialize the inter-mapper relationships of all mappers that have been constructed thus far.
sqlalchemy.orm.mapper.configure_mappers()
sqlalchemy.orm.relationships._determine_joins()
問題或許就在 "inter-mapper relationships of all mappers",會引發下面的錯誤:
NoForeignKeysError: Could not determine join condition between parent/child tables on relationship Address.user - there are no foreign keys linking these tables. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or specify a 'primaryjoin' expression.
InvalidRequestError: One or more mappers failed to initialize - can't proceed with initialization of other mappers. Triggering mapper: 'Mapper|Address|address'. Original exception was: Could not determine join condition between parent/child tables on relationship Address.user - there are no foreign keys linking these tables. Ensure that referencing columns are associated with a ForeignKey or ForeignKeyConstraint, or specify a 'primaryjoin' expression.
其中 "One or more mappers failed to initialize" 看起來就滿合理的。
參考資料:
- python - Must two SQLAlchemy declarative models share the same declarative_base()? - Stack Overflow Denis Otkidach: 成功地在一個 session 中使用多個 declarative base,每個 base 有自己的 metadata,綁定不同的 database。 #ril
- Flask+SQLAlchemy with multiple dababases and shared models - Blog - BlaXpirit 提到
scoped_session(sessionmaker(engine))的用法,不過這裡是共用一個 declarative base #ril
- Querying - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
- Querying with Joins - Object Relational Tutorial — SQLAlchemy 1.2 Documentation #ril
文件: