Django 1.8 official documentation translation: 2-5-6 Multiple databases

Date: 2019-12-04

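Multiple databases

This topic guide describes Django's support for interacting with multiple databases. Most of the rest of Django's documentation assumes you are interacting with a single database. If you want to interact with multiple databases, you'll need to take some additional steps.

Defining your databases

The first step to using more than one database with Django is to tell Django about the database servers you'll be using. This is done using the DATABASES setting. This setting maps database aliases, which are the way to refer to a specific database throughout Django, to a dictionary of settings for that specific connection. The settings in the inner dictionaries are described fully in the DATABASES documentation.

Databases can have any alias you choose. However, the alias default has special significance. Django uses the database with the alias default when no other database has been selected.

The following is an example settings.py snippet defining two databases: a default PostgreSQL database and a MySQL database called users: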

DATABASES = {
    'default': {
        'NAME': 'app_data',
        'ENGINE': 'django.db.backends.postgresql_psycopg2',
        'USER': 'postgres_user',
        'PASSWORD': 's3krit'
    },
    'users': {
        'NAME': 'user_data',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'mysql_user',
        'PASSWORD': 'priv4te'
    }
}

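If the concept of a default database doesn't make sense in the context of your project, you need to be careful to always specify the database that you want to use. Django requires that a default database entry be defined, but its parameters dictionary can be left empty if it will not be used. To do this, you must set up DATABASE_ROUTERS for all of your apps' models, including those in any contrib and third-party apps you're using, so that no queries are routed to the default database. The following is an example settings.py snippet defining two non-default databases, with the default entry intentionally left empty: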

DATABASES = {
    'default': {},
    'users': {
        'NAME': 'user_data',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'mysql_user',
        'PASSWORD': 'superS3cret'
    },
    'customers': {
        'NAME': 'customer_data',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'mysql_cust',
        'PASSWORD': 'veryPriv@ate'
    }
}

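If you attempt to access a database that you haven't defined in your DATABASES setting, Django will raise a django.db.utils.ConnectionDoesNotExist exception.

Synchronizing your databases

The migrate management command operates on one database at a time. By default, it operates on the default database, but by providing a --database argument, you can tell migrate to synchronize a different database. So, to synchronize all models onto all databases in the first example above, you would need to call: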

$ ./manage.py migrate
$ ./manage.py migrate --database=users

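If you don't want every application to be synchronized onto a particular database, you can define a database router that implements a policy constraining the availability of particular models.

Using other management commands

The other django-admin commands that interact with the database operate in the same way as migrate: they only ever operate on one database at a time, using --database to control the database used.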

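Automatic database routing

The easiest way to use multiple databases is to set up a database routing scheme. The default routing scheme ensures that objects remain "sticky" to their original database (i.e., an object retrieved from the foo database will be saved on the same database). The default routing scheme also ensures that if a database isn't specified, all queries fall back to the default database.

You don't have to do anything to activate the default routing scheme; it is provided "out of the box" on every Django project. However, if you want to implement more interesting database allocation behaviors, you can define and install your own database routers.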

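Database routers

A database router is a class that provides up to four methods:

db_for_read(model, **hints)

Suggest the database that should be used for read operations for objects of type model.

If a database operation is able to provide any additional information that might assist in selecting a database, it will be provided in the hints dictionary. Details on valid hints are provided below.

Returns None if there is no suggestion.

db_for_write(model, **hints)

Suggest the database that should be used for writes of objects of type model.

If a database operation is able to provide any additional information that might assist in selecting a database, it will be provided in the hints dictionary. Details on valid hints are provided below.

Returns None if there is no suggestion.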

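allow_relation(obj1, obj2, **hints)

Return True if a relation between obj1 and obj2 should be allowed, False if the relation should be prevented, or None if the router has no opinion. This is purely a validation operation, used by foreign key and many-to-many operations to determine if a relation should be allowed between two objects.

allow_migrate(db, app_label, model_name=None, **hints)

Determine if the migration operation is allowed to run on the database with alias db. Return True if the operation should run, False if it shouldn't run, or None if the router has no opinion.

The app_label positional argument is the label of the application being migrated.

model_name is set by most migration operations to the value of model._meta.model_name (the lowercased version of the model's __name__) of the model being migrated. Its value is None for the RunPython and RunSQL operations unless they provide it using hints.

hints are used by certain operations to communicate additional information to the router.

When model_name is set, hints normally contains the model class under the key 'model'. Note that it may be a historical model, and thus not have any custom attributes, methods, or managers. You should only rely on _meta.

This method can also be used to determine the availability of a model on a given database.

Note that migrations will silently skip any operation on a model for which this method returns False. This may result in broken foreign keys, extra tables, or missing tables if you change it once you have applied some migrations.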

Changed in Django 1.8:

The signature of allow_migrate has changed significantly from previous versions. See the deprecation notes for more details.

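A router doesn't have to provide all of these methods; it may omit one or more of them. If one of the methods is omitted, Django will skip that router when performing the relevant check.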
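As a minimal sketch of the interface described above, the router below implements only two of the four methods. FakeMeta, Article, and Comment are stand-ins for real Django models so that the snippet runs without a configured project; the app labels 'news' and 'forum' and the alias 'archive' are assumptions made for illustration.

```python
class FakeMeta(object):
    """Stand-in for a model's _meta (hypothetical, illustration only)."""
    def __init__(self, app_label, model_name):
        self.app_label = app_label
        self.model_name = model_name


class Article(object):
    """Stand-in for a model in a hypothetical 'news' app."""
    _meta = FakeMeta('news', 'article')


class Comment(object):
    """Stand-in for a model in a hypothetical 'forum' app."""
    _meta = FakeMeta('forum', 'comment')


class NewsRouter(object):
    """
    Sends reads for the 'news' app to the hypothetical 'archive' alias.
    db_for_write and allow_relation are omitted on purpose, so this
    router would be skipped for those checks.
    """
    def db_for_read(self, model, **hints):
        if model._meta.app_label == 'news':
            return 'archive'
        return None  # No opinion; another router may decide.

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == 'news':
            return db == 'archive'
        return None  # No opinion about other apps.


router = NewsRouter()
news_read_db = router.db_for_read(Article)
forum_read_db = router.db_for_read(Comment)
```

Returning None rather than False from the methods that have no opinion is what lets a later router in the list make the decision.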

Hints

The hints received by the database router can be used to decide which database should receive a given request.

At present, the only hint that will be provided is instance, an object instance that is related to the read or write operation that is underway. This might be the instance that is being saved, or it might be an instance that is being added in a many-to-many relation. In some cases, no instance hint will be provided at all. The router checks for the existence of an instance hint, and determines if that hint should be used to alter routing behavior.
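The following sketch shows one way a router might inspect the instance hint. FakeState and FakeInstance are hypothetical stand-ins for a model instance and its _state attribute, and StickyReadRouter is an illustrative name, not something Django ships.

```python
class FakeState(object):
    """Stand-in for a model instance's _state (hypothetical)."""
    def __init__(self, db=None):
        self.db = db


class FakeInstance(object):
    """Stand-in for a model instance that may or may not have been saved."""
    def __init__(self, db=None):
        self._state = FakeState(db)


class StickyReadRouter(object):
    """
    If an instance hint is present and that instance already lives on a
    database, suggest the same database for the read.
    """
    def db_for_read(self, model, **hints):
        instance = hints.get('instance')  # The hint may be absent.
        if instance is not None and instance._state.db:
            return instance._state.db
        return None  # No usable hint, so no opinion.


router = StickyReadRouter()
with_hint = router.db_for_read(object, instance=FakeInstance('replica1'))
unsaved_hint = router.db_for_read(object, instance=FakeInstance())
no_hint = router.db_for_read(object)
```

Using hints.get() instead of indexing keeps the router safe for the cases where no instance hint is supplied.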

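Using routers

Database routers are installed using the DATABASE_ROUTERS setting. This setting defines a list of class names, each specifying a router that should be used by the master router (django.db.router).

The master router is used by Django's database operations to allocate database usage. Whenever a query needs to know which database to use, it calls the master router, providing a model and a hint (if available). Django then tries each router in turn until a database suggestion can be found. If no suggestion can be found, it tries the current _state.db of the hint instance. If a hint instance wasn't provided, or the instance doesn't currently have database state, the master router will allocate the default database.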
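The lookup order in the previous paragraph can be sketched in a few lines. This is a simplified illustration of the documented behavior, not Django's actual implementation; resolve_db and every class below are hypothetical names.

```python
class State(object):
    """Stand-in for instance._state (hypothetical)."""
    def __init__(self, db=None):
        self.db = db


class Instance(object):
    """Stand-in for a model instance used as the 'instance' hint."""
    def __init__(self, db=None):
        self._state = State(db)


class NoOpinionRouter(object):
    def db_for_read(self, model, **hints):
        return None


class ReportsRouter(object):
    def db_for_read(self, model, **hints):
        return 'reports'


class WriteOnlyRouter(object):
    # Provides no db_for_read, so it is skipped for read lookups.
    def db_for_write(self, model, **hints):
        return 'primary'


def resolve_db(routers, action, model, **hints):
    """Pick a database alias following the documented fallback order."""
    # 1. Ask each router in turn; the first suggestion wins.
    for router in routers:
        method = getattr(router, action, None)
        if method is None:
            continue  # Router omits this method.
        suggestion = method(model, **hints)
        if suggestion:
            return suggestion
    # 2. Fall back to the database of the hint instance, if any.
    instance = hints.get('instance')
    if instance is not None and instance._state.db:
        return instance._state.db
    # 3. Otherwise use the default database.
    return 'default'


from_router = resolve_db([NoOpinionRouter(), ReportsRouter()], 'db_for_read', object)
from_hint = resolve_db([NoOpinionRouter()], 'db_for_read', object,
                       instance=Instance('users'))
from_default = resolve_db([WriteOnlyRouter()], 'db_for_read', object)
```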

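An example

Example purposes only!

This example is intended as a demonstration of how the router infrastructure can be used to alter database usage. It intentionally ignores some complex issues in order to demonstrate how routers are used.

This example won't work if any of the models in myapp contain relationships to models outside of the other database. Cross-database relationships introduce referential integrity problems that Django can't currently handle.

The primary/replica (referred to as master/slave by some databases) configuration described is also flawed: it doesn't provide any solution for handling replication lag (i.e., query inconsistencies introduced because of the time taken for a write to propagate to the replicas). It also doesn't consider the interaction of transactions with the database utilization strategy.

So, what does this mean in practice? Let's consider another sample configuration. This one will have several databases: one for the auth application, and all other apps using a primary/replica setup with two read replicas. Here are the settings specifying these databases: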

DATABASES = {
    'auth_db': {
        'NAME': 'auth_db',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'mysql_user',
        'PASSWORD': 'swordfish',
    },
    'primary': {
        'NAME': 'primary',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'mysql_user',
        'PASSWORD': 'spam',
    },
    'replica1': {
        'NAME': 'replica1',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'mysql_user',
        'PASSWORD': 'eggs',
    },
    'replica2': {
        'NAME': 'replica2',
        'ENGINE': 'django.db.backends.mysql',
        'USER': 'mysql_user',
        'PASSWORD': 'bacon',
    },
}

Now we'll need to handle routing. First we want a router that knows to send queries for the auth app to auth_db:

class AuthRouter(object):
    """
    A router to control all database operations on models in the
    auth application.
    """
    def db_for_read(self, model, **hints):
        """
        Attempts to read auth models go to auth_db.
        """
        if model._meta.app_label == 'auth':
            return 'auth_db'
        return None

    def db_for_write(self, model, **hints):
        """
        Attempts to write auth models go to auth_db.
        """
        if model._meta.app_label == 'auth':
            return 'auth_db'
        return None

    def allow_relation(self, obj1, obj2, **hints):
        """
        Allow relations if a model in the auth app is involved.
        """
        if (obj1._meta.app_label == 'auth' or
                obj2._meta.app_label == 'auth'):
            return True
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """
        Make sure the auth app only appears in the 'auth_db'
        database.
        """
        if app_label == 'auth':
            return db == 'auth_db'
        return None

We also want a router that sends all other apps to the primary/replica configuration, and randomly chooses a replica to read from:

import random

class PrimaryReplicaRouter(object):
    def db_for_read(self, model, **hints):
        """
        Reads go to a randomly-chosen replica.
        """
        return random.choice(['replica1', 'replica2'])

    def db_for_write(self, model, **hints):
        """
        Writes always go to primary.
        """
        return 'primary'

    def allow_relation(self, obj1, obj2, **hints):
        """
        Relations between objects are allowed if both objects are
        in the primary/replica pool.
        """
        db_list = ('primary', 'replica1', 'replica2')
        if obj1._state.db in db_list and obj2._state.db in db_list:
            return True
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        """
        All non-auth models end up in this pool.
        """
        return True

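Finally, in the settings file, we add the following (substituting path.to. with the actual Python path to the module or modules where the routers are defined):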

DATABASE_ROUTERS = ['path.to.AuthRouter', 'path.to.PrimaryReplicaRouter']

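The order in which routers are processed is significant. Routers will be queried in the order they are listed in the DATABASE_ROUTERS setting. In this example, the AuthRouter is processed before the PrimaryReplicaRouter, and as a result, decisions concerning the models in auth are processed before any other decision is made. If the DATABASE_ROUTERS setting listed the two routers in the other order, PrimaryReplicaRouter.allow_migrate() would be processed first. The catch-all nature of the PrimaryReplicaRouter implementation would mean that all models would be available on all databases.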
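The effect of router order can be reproduced with trimmed copies of the two routers that keep only allow_migrate. The ask_routers helper is a hypothetical simplification of "try each router in turn and take the first opinion"; it is not Django's own code, and its final fallback of True is an assumption of this sketch.

```python
class AuthRouter(object):
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label == 'auth':
            return db == 'auth_db'
        return None


class PrimaryReplicaRouter(object):
    def allow_migrate(self, db, app_label, model_name=None, **hints):
        return True  # Catch-all: always has an opinion.


def ask_routers(routers, db, app_label, **hints):
    """Return the first non-None allow_migrate answer (simplified sketch)."""
    for router in routers:
        method = getattr(router, 'allow_migrate', None)
        if method is None:
            continue
        answer = method(db, app_label, **hints)
        if answer is not None:
            return answer
    return True  # Assumed here: no opinion at all means "allowed".


documented_order = [AuthRouter(), PrimaryReplicaRouter()]
reversed_order = [PrimaryReplicaRouter(), AuthRouter()]

# With the documented order, auth is kept off the 'primary' database.
auth_on_primary = ask_routers(documented_order, 'primary', 'auth')

# With the order reversed, the catch-all router answers first.
auth_on_primary_reversed = ask_routers(reversed_order, 'primary', 'auth')
```

In the reversed list AuthRouter is never consulted, because PrimaryReplicaRouter never returns None.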

With this setup installed, let's run some Django code:

>>> # This retrieval will be performed on the 'auth_db' database
>>> fred = User.objects.get(username='fred')
>>> fred.first_name = 'Frederick'

>>> # This save will also be directed to 'auth_db'
>>> fred.save()

>>> # This retrieval will be randomly allocated to a replica database
>>> dna = Person.objects.get(name='Douglas Adams')

>>> # A new object has no database allocation when created
>>> mh = Book(title='Mostly Harmless')

>>> # This assignment will consult the router, and set mh onto
>>> # the same database as the author object
>>> mh.author = dna

>>> # This save will force the 'mh' instance onto the primary database...
>>> mh.save()

>>> # ... but if we re-retrieve the object, it will come back on a replica
>>> mh = Book.objects.get(title='Mostly Harmless')

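Manually selecting a database

Django also provides an API that allows you to maintain complete control over database usage in your code. A manually specified database allocation will take priority over a database allocated by a router.

Manually selecting a database for a QuerySet

You can select the database for a QuerySet at any point in the QuerySet "chain". Just call using() on the QuerySet to get another QuerySet that uses the specified database.

using() takes a single argument: the alias of the database on which you want to run the query. For example: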

>>> # This will run on the 'default' database.
>>> Author.objects.all()

>>> # So will this.
>>> Author.objects.using('default').all()

>>> # This will run on the 'other' database.
>>> Author.objects.using('other').all()

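Selecting a database for save()

Use the using keyword to Model.save() to specify to which database the data should be saved.

For example, to save an object to the legacy_users database, you'd use this: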

>>> my_object.save(using='legacy_users')

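If you don't specify using, the save() method will save into the default database allocated by the routers.

Moving an object from one database to another

If you've saved an instance to one database, it might be tempting to use save(using=...) as a way to migrate the instance to a new database. However, if you don't take appropriate steps, this could have some unexpected consequences.

Consider the following example: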

>>> p = Person(name='Fred')
>>> p.save(using='first')  # (statement 1)
>>> p.save(using='second') # (statement 2)

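In statement 1, a new Person object is saved to the first database. At this time, p doesn't have a primary key, so Django issues an SQL INSERT statement. This creates a primary key, and Django assigns that primary key to p.

When the save occurs in statement 2, p already has a primary key value, and Django will attempt to use that primary key on the new database. If the primary key value isn't in use in the second database, then you won't have any problems: the object will be copied to the new database.

However, if the primary key of p is already in use on the second database, the existing object in the second database will be overwritten when p is saved.

You can avoid this in two ways. First, you can clear the primary key of the instance. If an object has no primary key, Django will treat it as a new object, avoiding any loss of data on the second database: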

>>> p = Person(name='Fred')
>>> p.save(using='first')
>>> p.pk = None # Clear the primary key.
>>> p.save(using='second') # Write a completely new object.

The second option is to use the force_insert option to save() to ensure that Django does an SQL INSERT:

>>> p = Person(name='Fred')
>>> p.save(using='first')
>>> p.save(using='second', force_insert=True)

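This will ensure that the person named Fred will have the same primary key on both databases. If that primary key is already in use when you try to save onto the second database, an error will be raised.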
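The three outcomes discussed in this section can be compared side by side with a toy model. ToyDatabase and ToyPerson are hypothetical stand-ins that only mimic the "update if the primary key exists, otherwise insert" behavior described above; they are not Django classes, and ValueError stands in for the database error a real backend would raise on a duplicate key.

```python
class ToyDatabase(object):
    """A dict of pk -> name standing in for one database table."""
    def __init__(self):
        self.rows = {}
        self.next_pk = 1

    def insert(self, pk, name):
        if pk is None:
            pk = self.next_pk  # Auto-assign a primary key.
        if pk in self.rows:
            raise ValueError('duplicate primary key %r' % pk)
        self.rows[pk] = name
        self.next_pk = max(self.next_pk, pk + 1)
        return pk

    def update(self, pk, name):
        if pk in self.rows:
            self.rows[pk] = name
            return True
        return False


class ToyPerson(object):
    def __init__(self, name):
        self.name = name
        self.pk = None

    def save(self, using, force_insert=False):
        # With a pk and no force_insert: try an UPDATE first.
        if self.pk is not None and not force_insert:
            if using.update(self.pk, self.name):
                return
        # Otherwise (or if nothing was updated): INSERT.
        self.pk = using.insert(self.pk, self.name)


def fresh_databases():
    """Return (first, second); 'second' already holds a row with pk 1."""
    first, second = ToyDatabase(), ToyDatabase()
    second.insert(None, 'Wilma')
    return first, second


# Plain save(): the existing row on 'second' is silently overwritten.
first, second = fresh_databases()
p = ToyPerson('Fred')
p.save(using=first)
p.save(using=second)
overwritten_rows = dict(second.rows)

# Clearing the pk first: Fred is stored as a new row instead.
first, second = fresh_databases()
p = ToyPerson('Fred')
p.save(using=first)
p.pk = None
p.save(using=second)
cleared_pk_rows = dict(second.rows)

# force_insert=True: the clash surfaces as an error, not as data loss.
first, second = fresh_databases()
p = ToyPerson('Fred')
p.save(using=first)
try:
    p.save(using=second, force_insert=True)
    force_insert_failed = False
except ValueError:
    force_insert_failed = True
```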

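Selecting a database to delete from

By default, a call to delete an existing object will be executed on the same database that was used to retrieve the object in the first place: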

>>> u = User.objects.using('legacy_users').get(username='fred')
>>> u.delete() # will delete from the `legacy_users` database

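To specify the database from which a model will be deleted, pass a using keyword argument to the Model.delete() method. This argument works just like the using keyword argument to save().

For example, if you're migrating a user from the legacy_users database to the new_users database, you might use these commands: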

>>> user_obj.save(using='new_users')
>>> user_obj.delete(using='legacy_users')

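Using managers with multiple databases

Use the db_manager() method on managers to give managers access to a non-default database.

For example, say you have a custom manager method that touches the database: User.objects.create_user(). Because create_user() is a manager method, not a QuerySet method, you can't do User.objects.using('new_users').create_user(). (The create_user() method is only available on User.objects, the manager, not on QuerySet objects derived from the manager.) The solution is to use db_manager(), like this: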

User.objects.db_manager('new_users').create_user(...)

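db_manager() returns a copy of the manager bound to the database you specify.

Using get_queryset() with multiple databases

If you're overriding get_queryset() on your manager, be sure to either call the method on the parent (using super()) or do the appropriate handling of the _db attribute on the manager (a string containing the name of the database to use).

For example, if you want to return a custom QuerySet class from the get_queryset method, you could do this: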

class MyManager(models.Manager):
    def get_queryset(self):
        qs = CustomQuerySet(self.model)
        if self._db is not None:
            qs = qs.using(self._db)
        return qs

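Exposing multiple databases in Django's admin interface

Django's admin doesn't have any explicit support for multiple databases. If you want to provide an admin interface for a model on a database other than the one specified by your router chain, you'll need to write custom ModelAdmin classes that will direct the admin to use a specific database for content.

ModelAdmin objects have five methods that require customization for multiple-database support: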

class MultiDBModelAdmin(admin.ModelAdmin):
    # A handy constant for the name of the alternate database.
    using = 'other'

    def save_model(self, request, obj, form, change):
        # Tell Django to save objects to the 'other' database.
        obj.save(using=self.using)

    def delete_model(self, request, obj):
        # Tell Django to delete objects from the 'other' database
        obj.delete(using=self.using)

    def get_queryset(self, request):
        # Tell Django to look for objects on the 'other' database.
        return super(MultiDBModelAdmin, self).get_queryset(request).using(self.using)

    def formfield_for_foreignkey(self, db_field, request=None, **kwargs):
        # Tell Django to populate ForeignKey widgets using a query
        # on the 'other' database.
        return super(MultiDBModelAdmin, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs)

    def formfield_for_manytomany(self, db_field, request=None, **kwargs):
        # Tell Django to populate ManyToMany widgets using a query
        # on the 'other' database.
        return super(MultiDBModelAdmin, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs)

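The implementation provided here implements a multi-database strategy where all objects of a given type are stored on a specific database (e.g., all User objects are in the other database). If your usage of multiple databases is more complex, your ModelAdmin will need to reflect that strategy.

Inlines can be handled in a similar fashion. They require three customized methods: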

class MultiDBTabularInline(admin.TabularInline):
    using = 'other'

    def get_queryset(self, request):
        # Tell Django to look for inline objects on the 'other' database.
        return super(MultiDBTabularInline, self).get_queryset(request).using(self.using)

    def formfield_for_foreignkey(self, db_field, request=None, **kwargs):
        # Tell Django to populate ForeignKey widgets using a query
        # on the 'other' database.
        return super(MultiDBTabularInline, self).formfield_for_foreignkey(db_field, request=request, using=self.using, **kwargs)

    def formfield_for_manytomany(self, db_field, request=None, **kwargs):
        # Tell Django to populate ManyToMany widgets using a query
        # on the 'other' database.
        return super(MultiDBTabularInline, self).formfield_for_manytomany(db_field, request=request, using=self.using, **kwargs)

Once you've written your model admin definitions, they can be registered with any Admin instance:

from django.contrib import admin

# Specialize the multi-db admin objects for use with specific models.
class BookInline(MultiDBTabularInline):
    model = Book

class PublisherAdmin(MultiDBModelAdmin):
    inlines = [BookInline]

admin.site.register(Author, MultiDBModelAdmin)
admin.site.register(Publisher, PublisherAdmin)

othersite = admin.AdminSite('othersite')
othersite.register(Publisher, MultiDBModelAdmin)

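This example sets up two admin sites. On the first site, the Author and Publisher objects are exposed; Publisher objects have a tabular inline showing books published by that publisher. The second site exposes just publishers, without the inlines.

Using raw cursors with multiple databases

If you are using more than one database, you can use django.db.connections to obtain the connection (and cursor) for a specific database. django.db.connections is a dictionary-like object that allows you to retrieve a specific connection using its alias: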

from django.db import connections
cursor = connections['my_db_alias'].cursor()

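Limitations of multiple databases

Cross-database relations

Django doesn't currently provide any support for foreign key or many-to-many relationships spanning multiple databases. If you have used a router to partition models to different databases, any foreign key and many-to-many relationships defined by those models must be internal to a single database.

This is because of referential integrity. In order to maintain a relationship between two objects, Django needs to know that the primary key of the related object is valid. If the primary key is stored on a separate database, it's not possible to easily evaluate the validity of a primary key.

If you're using Postgres, Oracle, or MySQL with InnoDB, this is enforced at the database integrity level: database-level key constraints prevent the creation of relations that can't be validated.

However, if you're using SQLite or MySQL with MyISAM tables, there is no enforced referential integrity; as a result, you may be able to "fake" cross-database foreign keys. This configuration is not officially supported by Django.

Behavior of contrib apps

Several contrib apps include models, and some apps depend on others. Since cross-database relationships are impossible, this creates some restrictions on how you can split these models across databases:

contenttypes.ContentType, sessions.Session, and sites.Site can each be stored in any database, given a suitable router.
The auth models (User, Group, and Permission) are linked together and linked to ContentType, so they must be stored in the same database as ContentType.
admin depends on auth, so its models must be in the same database as auth.
flatpages and redirects depend on sites, so their models must be in the same database as sites.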
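One way to respect these constraints is a router that treats the interdependent contrib apps as a single group. The sketch below is an assumption-laden illustration: ContribRouter, the 'auth_db' alias, and the fake_model helper are hypothetical, and the stand-in model classes exist only so the snippet runs without a Django project.

```python
class FakeMeta(object):
    """Stand-in for a model's _meta (hypothetical)."""
    def __init__(self, app_label):
        self.app_label = app_label


def fake_model(app_label):
    """Build a stand-in model class for the given app label."""
    return type('FakeModel', (object,), {'_meta': FakeMeta(app_label)})


class ContribRouter(object):
    """Keeps auth, contenttypes, and admin together on one database."""
    route_app_labels = {'auth', 'contenttypes', 'admin'}
    db_alias = 'auth_db'  # Assumed alias, as in the example settings.

    def db_for_read(self, model, **hints):
        if model._meta.app_label in self.route_app_labels:
            return self.db_alias
        return None

    def db_for_write(self, model, **hints):
        if model._meta.app_label in self.route_app_labels:
            return self.db_alias
        return None

    def allow_relation(self, obj1, obj2, **hints):
        # Only vouch for relations that stay inside the group.
        if (obj1._meta.app_label in self.route_app_labels and
                obj2._meta.app_label in self.route_app_labels):
            return True
        return None

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        if app_label in self.route_app_labels:
            return db == self.db_alias
        return None


router = ContribRouter()
User = fake_model('auth')
ContentType = fake_model('contenttypes')
FlatPage = fake_model('flatpages')

user_db = router.db_for_write(User)
flatpage_db = router.db_for_write(FlatPage)
same_group = router.allow_relation(User, ContentType)
```

flatpages is deliberately left out of the group here; it would need to be routed together with sites by a separate rule.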