SQLAlchemy Study Notes (3): Building Relationships in the ORM

Date: 2019-11-07



These are my personal notes; correctness is not guaranteed.

Building relationships: `ForeignKey` and `relationship`

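The key to building relationships is understanding how these two functions are used. ForeignKey was already covered in "SQL Expression Language - Constraints in Table Definitions"; what matters most there is the ondelete and onupdate parameters.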
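As a quick reminder of what those two parameters do, here is a minimal sketch that renders the DDL for a foreign key. The table and column names are made up for illustration, and the output comes from SQLAlchemy's default dialect, so a specific database may format it slightly differently.

```python
from sqlalchemy import Column, ForeignKey, Integer, MetaData, Table
from sqlalchemy.schema import CreateTable

metadata = MetaData()

# Hypothetical tables, only used to render DDL
parent = Table("parent", metadata, Column("id", Integer, primary_key=True))

child = Table(
    "child", metadata,
    Column("id", Integer, primary_key=True),
    # ondelete / onupdate end up in the FOREIGN KEY clause of CREATE TABLE
    Column("parent_id", Integer,
           ForeignKey("parent.id", ondelete="CASCADE", onupdate="CASCADE")),
)

ddl = str(CreateTable(child))
print(ddl)
```

The printed statement should contain both an ON DELETE CASCADE and an ON UPDATE CASCADE clause on the parent_id foreign key.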

`relationship`

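The relationship function is used in the ORM to build associations between tables. Unlike ForeignKey, the relationship it defines is not part of the table definition; it is computed dynamically.
An attribute defined with it is roughly comparable to a view in SQL.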

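This function is a little hard to use: a few of its parameters are difficult to understand, and it accepts so many of them that it can feel intimidating. The sections below walk through relationship in one-to-many, many-to-one, and many-to-many scenarios to get familiar with it step by step.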

First, the setup:

from sqlalchemy import Table, Column, Integer, ForeignKey
from sqlalchemy.orm import relationship
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

One-to-many

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)

    # Child holds the ForeignKey to Parent, so nothing more needs to be specified here.
    children = relationship("Child")  # a collection of children, comparable to a view

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

A Parent can have many children. Thanks to relationship, we can get them directly through parent.children and skip writing the query by hand.
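To see parent.children in action, here is a small end-to-end sketch. It assumes SQLAlchemy 1.4 or newer (where declarative_base can be imported from sqlalchemy.orm) and an in-memory SQLite database; neither choice comes from the original notes.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child")

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))

engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)

parent = Parent(children=[Child(), Child()])
session.add(parent)  # the default save-update cascade adds the children too
session.commit()

# Accessing the attribute loads the related rows; no explicit query needed
child_parent_ids = [child.parent_id for child in parent.children]
```

Note that parent_id was never set by hand; the ORM filled it in from the relationship when the session was flushed.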

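Back-references

1. `backref` and `back_populates`

What if we need the parent object of a child? Can we simply access child.parent?

To support this, SQLAlchemy provides two parameters: backref and back_populates.

The two have exactly the same effect. The difference is that backref only needs to be declared on Parent.children; Child.parent is then created dynamically.

back_populates, in contrast, must be declared explicitly in both classes, which is more verbose (but arguably clearer).

First, the backref version: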

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                            backref="parent")  # backref dynamically creates a parent attribute on Child that points back to this class

# The Child class needs no changes

Now the back_populates version:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")  # back_populates names the matching attribute on Child

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

    # This side must be declared too; it cannot be omitted!
    parent = relationship("Parent", back_populates="children")  # parent is a scalar attribute, not a collection

NOTE: the two relationship declarations need no extra hints. SQLAlchemy works out from the foreign key that parent.children is a collection and child.parent is a scalar attribute.
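The practical benefit of a back-reference is that the two sides stay in sync in memory, even before anything is written to the database. The sketch below illustrates this with the back_populates version; it assumes SQLAlchemy 1.4 or newer for the declarative_base import and needs no database connection.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent")

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="children")

parent = Parent()
child = Child()

child.parent = parent                     # set the scalar side...
in_collection = child in parent.children  # ...and the collection side follows

parent.children.remove(child)             # remove from the collection...
parent_after_remove = child.parent        # ...and the scalar side is reset to None
```

The same behavior applies to the backref version, since both spellings set up the same pair of attributes.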

2. Parameters for the back-reference: `sqlalchemy.orm.backref(name, **kwargs)`

With back_populates, it is easy to pass whatever parameters we need to each of the two relationship calls:

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child", back_populates="parent",
                            lazy='dynamic')  # set lazy on the collection side

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship("Parent", back_populates="children",
                          lazy='joined')  # set lazy here too; 'dynamic' is not valid on the many-to-one side

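With backref, however, there is only one relationship call and Child.parent is created implicitly. How do we pass parameters to that attribute?

The answer is the backref() function; use it in place of the plain string value of the backref parameter: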

from sqlalchemy.orm import backref

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                            backref=backref("parent", lazy='joined'))  # backref() passes parameters to Child.parent ('dynamic' would not be valid on this many-to-one side)

# The Child class needs no changes

The arguments of backref() are forwarded to relationship(), so the two accept the same parameters.
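One way to convince yourself that the arguments really reach the implicit attribute is to inspect it after the mappers are configured. This is a sketch under the assumption of SQLAlchemy 1.4 or newer; the lazy="joined" value is just an example setting.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import backref, configure_mappers, declarative_base, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", backref=backref("parent", lazy="joined"))

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))

# Child.parent is only created once the mappers have been configured
configure_mappers()

backref_lazy = Child.parent.property.lazy        # the value passed to backref()
backref_uselist = Child.parent.property.uselist  # False: a scalar, not a collection
```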

Many-to-one

A many-to-one is similar to a one-to-many relationship. The difference is that this relationship is looked at from the "many" side.
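The notes give no code for this case, so here is a minimal sketch of what "looked at from the many side" can mean: the relationship is declared on the class that holds the foreign key. The class names mirror the earlier examples, and the declarative_base import assumes SQLAlchemy 1.4 or newer.

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    # Declared on the "many" side: each child refers to a single parent
    parent = relationship("Parent")

shared_parent = Parent()
first_child = Child(parent=shared_parent)
second_child = Child(parent=shared_parent)
```

Many children can point at the same parent, and child.parent is a single object rather than a list.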

One-to-one

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    child = relationship("Child",
                         uselist=False,   # do not use a collection; this is the key
                         back_populates="parent")

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))

    # On the class that holds the ForeignKey this is a scalar attribute by default, so uselist=False is not needed
    parent = relationship("Parent", back_populates="child")
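A short usage sketch of the one-to-one mapping, again assuming SQLAlchemy 1.4 or newer for the import. Both sides are plain scalar attributes:

```python
from sqlalchemy import Column, ForeignKey, Integer
from sqlalchemy.orm import declarative_base, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    child = relationship("Child", uselist=False, back_populates="parent")

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))
    parent = relationship("Parent", back_populates="child")

parent = Parent()
child = Child()
parent.child = child  # plain assignment, no append()
```

Keep in mind that uselist=False only changes how the ORM presents the attribute. If the database itself should reject a second child row for the same parent, a unique constraint on parent_id would be needed as well.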

Many-to-many

# Many-to-many requires an association table!
association_table = Table('association', Base.metadata,
    Column('left_id', Integer, ForeignKey('left.id')),  # by convention, the left side is the parent
    Column('right_id', Integer, ForeignKey('right.id'))  # and the right side is the child
)

class Parent(Base):
    __tablename__ = 'left'
    id = Column(Integer, primary_key=True)
    children = relationship("Child",
                    secondary=association_table)  # secondary names the association table to use

class Child(Base):
    __tablename__ = 'right'
    id = Column(Integer, primary_key=True)

To add a back-reference, backref or back_populates can be used here as well.
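Here is a sketch of the many-to-many mapping with a back-reference added through back_populates. The parents attribute name, the in-memory SQLite database, and the SQLAlchemy 1.4+ import are assumptions made for this example.

```python
from sqlalchemy import Column, ForeignKey, Integer, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

association_table = Table(
    "association", Base.metadata,
    Column("left_id", Integer, ForeignKey("left.id")),
    Column("right_id", Integer, ForeignKey("right.id")),
)

class Parent(Base):
    __tablename__ = "left"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", secondary=association_table,
                            back_populates="parents")

class Child(Base):
    __tablename__ = "right"
    id = Column(Integer, primary_key=True)
    # The reverse side names the same association table
    parents = relationship("Parent", secondary=association_table,
                           back_populates="children")

engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)

shared_child = Child()
first_parent = Parent(children=[shared_child])
second_parent = Parent(children=[shared_child])
session.add_all([first_parent, second_parent])
session.commit()

# The ORM wrote one association row per (parent, child) pair
link_rows = session.execute(association_table.select()).fetchall()
parent_count = len(shared_child.parents)
```

The association table is never touched directly; appending to either collection is enough for the ORM to insert or delete the link rows.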

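user2user

What if both sides of a many-to-many relationship are users, that is, the same table? How should it be declared?

Take "following" and "followers": you are a user, your followers are users, and the accounts you follow are users too.

In this case both keys of the association table point at user, so SQLAlchemy cannot tell which side is which and has to be told explicitly. This is what the primaryjoin and secondaryjoin parameters are for.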

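# Assumption: this snippet uses Flask-SQLAlchemy, where db is the app's SQLAlchemy() instance,
# and UserMixin comes from flask_login.
# Association table: the user on the left follows the user on the right
followers = db.Table('followers',
    db.Column('follower_id', db.Integer, db.ForeignKey('user.id')),  # left side, the follower
    db.Column('followed_id', db.Integer, db.ForeignKey('user.id'))  # right side, the user being followed
)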

class User(UserMixin, db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), index=True, unique=True, nullable=False)
    email = db.Column(db.String(120), index=True, unique=True, nullable=False)
    password_hash = db.Column(db.String(128), nullable=False)

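    # users I follow
    followed = db.relationship(
        'User',
        secondary=followers,  # the many-to-many association table
        primaryjoin=(followers.c.follower_id == id),  # joins this user to the association table (I am the follower)
        secondaryjoin=(followers.c.followed_id == id),  # joins the association table to the target users (the ones being followed)
        lazy='dynamic',  # return a query instead of a list, so filter_by and similar methods can be used
        backref=db.backref('followers', lazy='dynamic'))  # my followers, also as a query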

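The confusing part here is that primaryjoin and secondaryjoin are easy to mix up.

primaryjoin: (in many-to-many) the join condition between the table of the class that declares the relationship and the association table. By default it is derived from the foreign keys.
secondaryjoin: (in many-to-many) the join condition between the association table and the target table. It too is derived from the foreign keys by default.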
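To make the two join conditions concrete, here is a sketch of the same follower model in plain SQLAlchemy, without Flask. It drops lazy='dynamic' so the attributes behave as ordinary lists, and it assumes SQLAlchemy 1.4 or newer plus an in-memory SQLite database; the usernames are made up.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, Table, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

followers = Table(
    "followers", Base.metadata,
    Column("follower_id", Integer, ForeignKey("user.id")),
    Column("followed_id", Integer, ForeignKey("user.id")),
)

class User(Base):
    __tablename__ = "user"
    id = Column(Integer, primary_key=True)
    username = Column(String(64))

    followed = relationship(
        "User",
        secondary=followers,
        primaryjoin=(followers.c.follower_id == id),    # this user -> association row
        secondaryjoin=(followers.c.followed_id == id),  # association row -> target user
        backref="followers",  # the reverse direction swaps the two conditions
    )

engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)

alice, bob = User(username="alice"), User(username="bob")
alice.followed.append(bob)  # alice follows bob
session.add(alice)
session.commit()

alice_follows = [user.username for user in alice.followed]
bob_followers = [user.username for user in bob.followers]
```

Reading alice.followed walks from alice through follower_id into the association table and out through followed_id; bob.followers walks the same rows in the opposite direction.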

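ORM-level "delete" cascade vs. FOREIGN KEY-level "ON DELETE" cascade

An earlier note covered cascading in Table definitions: ON DELETE and ON UPDATE can be set to CASCADE through the ForeignKey parameters.

But relationship also has its own cascade parameter, which controls what the ORM does when it emits SQL, plus a passive_deletes parameter that controls how the ORM cooperates with ON DELETE.

With this many cascade-related settings it is easy to get confused. How exactly do the three differ?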

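The ON DELETE and ON UPDATE foreign key constraints do overlap a good deal with ORM-level cascades in what they do.
But there are also several differences:

A database-level ON DELETE cascade is configured on the many-to-one side: the foreign key is defined on the "many" side, and the ON DELETE clause goes there too. At the ORM level the direction is reversed. SQLAlchemy handles deletion of the "many" objects from the "one" side, so the delete and delete-orphan cascades are configured on the one-to-many relationship.
At the database level, a foreign key without ON DELETE is often used to prevent a parent row from being deleted and leaving orphaned child rows behind. To get this behavior on a one-to-many mapping, SQLAlchemy's default of setting the foreign key to NULL can be caught in one of two ways:
1. The simplest and most common way is to declare the foreign key column NOT NULL. The attempt to set the column to NULL then fails with a NOT NULL constraint exception.
2. The more specialized way is to set the passive_deletes flag to the string "all". This completely disables SQLAlchemy's behavior of setting the foreign key column to NULL, and the parent row is DELETEd without touching the child rows. This is what lets a database-level ON DELETE rule, or any other trigger, fire in every case.
A database-level ON DELETE cascade is generally more efficient than the ORM-level one. The database can chain a series of cascade operations across many relationships within a single DELETE statement, whereas the ORM has to load each related collection to find the rows to delete.
SQLAlchemy does not need to be that sophisticated, because combining the passive_deletes option with properly configured foreign key constraints gives smooth integration with the database's own ON DELETE functionality.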

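Method 1: cascade at the ORM level

The cascade parameter of relationship decides which Session operations on the parent object are cascaded to its child objects. Its value is a str of options separated by commas:

save-update: one of the defaults. When an object is added to the session (which leads to an SQL INSERT or UPDATE), all objects related to it are added as well.
merge: one of the defaults. When an object is merged (much like a dict update: existing state is replaced, missing state is added), all objects related to it are merged as well.
expunge: when an object is expunged, its related objects are expunged as well. This only removes them from the session; nothing is deleted from the database.
delete: when the parent object is deleted, the objects related to it are deleted as well.
delete-orphan: when a child object is de-associated from its parent (an "orphan"), for example by removing it from parent.children, the child is deleted.
refresh-expire: rarely used.
all: selects every option except delete-orphan. (This is why "all, delete-orphan" is so common; it is the real "all".)

The default is "save-update, merge".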
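The delete and delete-orphan options are easier to grasp when you watch them run. The sketch below uses cascade="all, delete-orphan" against an in-memory SQLite database; it assumes SQLAlchemy 1.4 or newer, and the row counts in the comments follow from the three children created here.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    children = relationship("Child", cascade="all, delete-orphan")

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id"))

engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = Session(engine)

parent = Parent(children=[Child(), Child(), Child()])
session.add(parent)
session.commit()

# delete-orphan: a child removed from the collection is deleted on flush
parent.children.remove(parent.children[0])
session.commit()
remaining_after_orphan = session.query(Child).count()  # 2 rows left

# delete: deleting the parent deletes its remaining children
session.delete(parent)
session.commit()
remaining_after_delete = session.query(Child).count()  # 0 rows left
```

Without delete-orphan, the removed child would stay in the table with parent_id set to NULL instead of being deleted.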

This is only a brief summary; the full documentation for these options is at SQLAlchemy - Cascades.

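Method 2: cascade at the database level

Set the ondelete and onupdate parameters of ForeignKey to CASCADE to get cascading at the database level.
Add the keyword argument passive_deletes="all" to the relationship. This completely disables SQLAlchemy's behavior of setting the foreign key column to NULL, so the ORM's DELETE of the parent row does not touch the child rows itself. (passive_deletes=True is the milder setting: it only leaves children that are not loaded in the session to the database.)

A DELETE then triggers the database's ON DELETE rule, which cascades the deletion to the child rows.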
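A sketch of the database-level approach follows. It uses passive_deletes=True rather than "all", and an in-memory SQLite database, which only enforces foreign keys after PRAGMA foreign_keys=ON is issued on each connection; both choices, as well as the SQLAlchemy 1.4+ import, are assumptions of this example rather than something the notes prescribe.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, event
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()

class Parent(Base):
    __tablename__ = "parent"
    id = Column(Integer, primary_key=True)
    # Leave unloaded children to the database instead of loading and updating them
    children = relationship("Child", passive_deletes=True)

class Child(Base):
    __tablename__ = "child"
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey("parent.id", ondelete="CASCADE"))

engine = create_engine("sqlite://")  # in-memory database, for illustration only

@event.listens_for(engine, "connect")
def enable_sqlite_foreign_keys(dbapi_connection, connection_record):
    # SQLite ignores foreign key constraints unless this pragma is on
    cursor = dbapi_connection.cursor()
    cursor.execute("PRAGMA foreign_keys=ON")
    cursor.close()

Base.metadata.create_all(engine)
session = Session(engine)
session.add(Parent(children=[Child(), Child()]))
session.commit()

parent = session.query(Parent).one()  # the children collection is not loaded
session.delete(parent)                # the ORM emits a DELETE for the parent row only
session.commit()

remaining_children = session.query(Child).count()  # the database removed them
```

On databases that enforce foreign keys by default, the event listener is unnecessary; the rest of the sketch stays the same.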