Oracle Database Concepts 11g Release 2: Reading Notes (3)...

Date: 2019-12-08


Table Cluster (P50-P55)

1. Table Cluster Definition

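A table cluster is a group of tables that share common columns and store related data in the same blocks. When tables are clustered, a single data block can contain rows from different tables. For example, one block can hold rows from both employees and departments rather than from only one of them.

The cluster key is the column (or columns) that the clustered tables have in common, for example the department_id column shared by employees and departments. You specify the cluster key when you create the cluster and when you add a new table to it.

The cluster key value is the value of the cluster key columns for a particular set of rows. All data with the same cluster key value is physically stored together. Each cluster key value is stored only once in the cluster and in the cluster index, no matter how many rows of different tables contain that value.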
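
The idea of "rows from several tables, grouped under one stored key value" can be illustrated with a small Python sketch. This is only a toy model and not how Oracle lays out blocks: the dictionary `cluster`, the helper `insert_row`, and the sample names are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy model: one "block" per cluster key value, holding rows from both tables.
# The key value is stored once (as the dict key), not repeated in every row.
cluster = defaultdict(lambda: {"departments": [], "employees": []})

def insert_row(table, department_id, row):
    """Place a row in the block that belongs to its cluster key value."""
    cluster[department_id][table].append(row)

# Illustrative sample data (hypothetical values).
insert_row("departments", 20, ("Marketing",))
insert_row("employees", 20, ("Hartstein",))
insert_row("employees", 20, ("Fay",))
insert_row("departments", 110, ("Accounting",))
insert_row("employees", 110, ("Higgins",))

# In this model a join on department_id = 20 touches a single block.
block_20 = cluster[20]
print(block_20["departments"], block_20["employees"])
```

Because both tables' rows for department 20 sit in the same entry, the join needs no second lookup, which is the intuition behind the I/O savings described in these notes.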

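If several tables are frequently queried together (especially in multi-table or join queries), consider clustering them. Because a table cluster stores related rows of different tables in the same blocks, a properly used cluster offers these benefits:

1) Less disk I/O for joins between the clustered tables

2) Shorter read time for joins between the clustered tables

3) Less storage space, because each cluster key value is stored only once

Clustered tables should not be used when:

1) Most access to the tables is by querying each table on its own

2) The tables are updated frequently

3) The tables frequently require a full table scan

4) The tables require truncating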

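2. Indexed Cluster Definition

An indexed cluster is a table cluster that uses an index to locate data. The cluster index is a B-tree index on the cluster key. The cluster index must be created before any rows are inserted into the table cluster.

In the following example, a cluster named employees_departments_cluster is created with department_id as the cluster key. Because the HASHKEYS clause is not specified, the cluster is an indexed cluster. Next, an index named idx_emp_dept_cluster is created on the cluster key.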

Example:

CREATE CLUSTER employees_departments_cluster
   (department_id NUMBER(4))
SIZE 512;

CREATE INDEX idx_emp_dept_cluster ON CLUSTER employees_departments_cluster;

Next, create the employees and departments tables in the cluster, specifying the department_id column as the cluster key.

Example:

CREATE TABLE employees(…)
   CLUSTER employees_departments_cluster(department_id);

CREATE TABLE departments(…)
   CLUSTER employees_departments_cluster(department_id);

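Finally, as you add rows to employees and departments, the database stores all rows for each department, from both tables, in the same data blocks. The rows are stored in a heap and located with the index.

The accompanying figure shows the storage structure of employees_departments_cluster: the database stores the employees of departments 20 and 110 together.

The B-tree cluster index associates each cluster key value with the physical address of the block that stores the data. For example, the following entry:
20,AADAAAA9d
gives the address of the block that stores the employees in department 20.

The cluster index is managed separately, just like an index on a nonclustered table, and it can reside in a different tablespace from the table cluster.

If the employees and departments tables were not defined as a table cluster, the database would not guarantee that related rows are stored together, as the accompanying figure shows.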

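3. Hash Cluster Definition

A hash cluster is like an indexed cluster, except that the index key is replaced with a hash function. No separate cluster index exists. In a hash cluster, the data is the index.

Like the key of an indexed cluster, the key of a hash cluster is a single column or a composite key. Oracle Database applies a hash function to specific cluster key values to produce a set of integers called hash values. The database hashes the cluster key to the physical address of a data block and stores rows with the same key value together.

In an indexed table or indexed cluster, Oracle locates rows by using key values stored in a separate index. Finding or storing a row in an indexed table or indexed cluster requires at least two I/Os:

1) At least one I/O to find or store the key value in the index

2) One I/O to read or write the row in the table or cluster

To find or store a row in a hash cluster, Oracle Database applies the hash function to the cluster key value of the row. The result corresponds to a data block in the cluster, which the database then reads or writes.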
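
The difference between the two access paths can be sketched in Python. This is a simplified model under stated assumptions: `hash_function` is a plain modulo stand-in (Oracle's internal hash function is not modeled), every index probe is counted as exactly one I/O, and the class and method names are hypothetical.

```python
HASHKEYS = 100  # number of hash buckets, mirroring a HASHKEYS 100 clause

def hash_function(department_id):
    # Stand-in hash (assumption): Oracle's real function is not modeled here.
    return department_id % HASHKEYS

class IndexedCluster:
    """Rows are found through a separate index: key value -> block number."""
    def __init__(self):
        self.index = {}   # the separate cluster index
        self.blocks = []  # each block is a list of rows

    def insert(self, key, row):
        if key not in self.index:
            self.index[key] = len(self.blocks)
            self.blocks.append([])
        self.blocks[self.index[key]].append(row)

    def lookup(self, key):
        block_no = self.index[key]    # I/O 1: probe the index
        rows = self.blocks[block_no]  # I/O 2: read the data block
        return rows, 2

class HashCluster:
    """No index: the hash of the key gives the block directly."""
    def __init__(self):
        self.blocks = [[] for _ in range(HASHKEYS)]

    def insert(self, key, row):
        self.blocks[hash_function(key)].append((key, row))

    def lookup(self, key):
        bucket = self.blocks[hash_function(key)]  # single I/O
        return [row for k, row in bucket if k == key], 1

indexed, hashed = IndexedCluster(), HashCluster()
for name in ("Hartstein", "Fay"):  # illustrative sample rows
    indexed.insert(20, name)
    hashed.insert(20, name)

print(indexed.lookup(20))  # rows plus the number of simulated I/Os
print(hashed.lookup(20))
```

Both lookups return the same rows; only the simulated I/O count differs, which is the point made in the two-step list above.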

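Hashing is a way of storing data that improves retrieval speed. Consider a hash cluster when the following conditions hold:

1) The table is queried much more often than it is modified

2) The hash key column is frequently queried with an equality condition, such as WHERE department_id=20. For such a query, the cluster key value is hashed, and the hash key value points directly to the block that stores the matching rows.

3) The number of rows in the table can be reasonably estimated in advance (this is needed to define the hash cluster parameters)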

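4. Hash Cluster Creation

To create a hash cluster, use the same CREATE CLUSTER statement as for an indexed cluster and add the HASHKEYS clause, as in the following example:

CREATE CLUSTER employees_departments_cluster
   (department_id NUMBER(4))
SIZE 8192 HASHKEYS 100;

Here department_id is defined as the hash key. In this example, HASHKEYS specifies the number of departments likely to exist. The number of departments can usually be estimated, so the number of rows can be estimated too, which satisfies condition 3 above.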

In this scenario, users typically run queries like the following, supplying different values for the bind variable :pid to query different department IDs.

Example:

SELECT * FROM employees
WHERE department_id = :pid;

SELECT * FROM departments
WHERE department_id = :pid;

SELECT * FROM departments d, employees e
WHERE e.department_id = d.department_id
AND d.department_id = :pid;

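Suppose a user runs the first query with a department_id of 20. Oracle Database uses 20 as the input to the hash function and locates the block that stores all employees in department 20.

The accompanying figure depicts a hash cluster segment as a horizontal row of blocks. As it shows, a query can retrieve the data in a single I/O.

A limitation of hash clusters:

1) Range scans on a nonindexed cluster key are not supported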

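Example:

CREATE CLUSTER employees_departments_cluster
   (department_id NUMBER(4))
SIZE 8192 HASHKEYS 100;

If no separate index exists on the hash cluster created by the statement above, a query for department_id values between 20 and 100 cannot use the hashing algorithm, because the database cannot hash every possible value between 20 and 100.

Because no index exists, the database must perform a full scan.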
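
The range-scan limitation can be seen in a small Python model. As before, this is a sketch under assumptions rather than Oracle's implementation: `hash_function` is a modulo stand-in, one list element stands for one block, and the department IDs are sample values.

```python
HASHKEYS = 100

def hash_function(department_id):
    # Stand-in hash (assumption); neighbouring keys need not share a block.
    return department_id % HASHKEYS

# One list per hash bucket; each bucket stands for one block.
blocks = [[] for _ in range(HASHKEYS)]
for dept in (10, 20, 50, 110, 120):  # illustrative department IDs
    blocks[hash_function(dept)].append(dept)

def equality_lookup(dept):
    """WHERE department_id = :pid -> hash once, read one block."""
    blocks_read = 1
    found = [d for d in blocks[hash_function(dept)] if d == dept]
    return found, blocks_read

def range_lookup(low, high):
    """WHERE department_id BETWEEN low AND high -> no hash shortcut,
    so every block is read (a full scan in this model)."""
    blocks_read = 0
    found = []
    for block in blocks:
        blocks_read += 1
        found.extend(d for d in block if low <= d <= high)
    return sorted(found), blocks_read

print(equality_lookup(20))
print(range_lookup(20, 100))
```

The equality lookup reads one block, while the range lookup has to visit all 100, since hashing gives no ordering over key values.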

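5. Hash Cluster Variations

A single-table hash cluster is an optimized hash cluster that supports only one table. It provides a one-to-one mapping between hash keys and rows. A single-table hash cluster is a good fit when users need fast access to a single table by primary key. For example, users often look up an employee record in the employees table by employee_id.

A sorted hash cluster is a variation of a hash cluster in which all rows corresponding to the same hash value are kept sorted in ascending order by the specified columns. A sorted hash cluster lets applications retrieve data quickly because the data is already in sorted order when it is inserted. For example, a hash cluster containing an orders table could be sorted by order_date.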
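
The sorted variation can be pictured as "one bucket per hash key, each bucket kept in order on insert". The Python sketch below is only an analogy: it assumes a hypothetical `customer_id` hash key (the notes do not name one), uses a `dict` for the hashing step, and uses `bisect.insort` to keep each bucket ordered by `order_date`.

```python
import bisect
from collections import defaultdict

# customer_id (hypothetical hash key) -> list of (order_date, order_id),
# kept in ascending order_date order at insert time.
sorted_cluster = defaultdict(list)

def insert_order(customer_id, order_date, order_id):
    """Insert a row into its bucket at the position that keeps it sorted."""
    bisect.insort(sorted_cluster[customer_id], (order_date, order_id))

# Rows arrive out of order (illustrative sample data).
insert_order(7, "2019-12-08", 103)
insert_order(7, "2019-11-30", 101)
insert_order(7, "2019-12-01", 102)

# Reading the bucket needs no sort step: rows are already ordered by date.
orders = sorted_cluster[7]
print(orders)
```

ISO-formatted date strings are used so that plain string comparison matches chronological order.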

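6. Hash Cluster Storage

Oracle Database allocates space for a hash cluster differently from an indexed cluster. The database multiplies the SIZE and HASHKEYS values from the CREATE CLUSTER statement and preallocates that much space, measured in bytes.

Example:

CREATE CLUSTER employees_departments_cluster
   (department_id NUMBER(4))
SIZE 8192 HASHKEYS 100;

In the example above, HASHKEYS specifies the number of departments likely to exist, and SIZE specifies the space taken by all the data of each department.

In a hash cluster, the value of HASHKEYS is fixed. Oracle Database does not use HASHKEYS to limit the values that can be inserted into the table, but if the number of distinct values inserted greatly exceeds HASHKEYS, retrieval from the hash cluster becomes less efficient. In that case, the hash cluster should be re-created with a new HASHKEYS value.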
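
The space calculation for the example (SIZE 8192, HASHKEYS 100) can be worked through in a few lines of Python. The block size of 4096 bytes is an assumption made for illustration, since the notes do not state one, and the function names are hypothetical.

```python
import math

SIZE = 8192        # bytes reserved per hash key, from the example
HASHKEYS = 100     # expected number of distinct departments
BLOCK_SIZE = 4096  # assumed database block size in bytes (not given in the notes)

def preallocated_bytes(size, hashkeys):
    """Space preallocated for the hash cluster: SIZE times HASHKEYS."""
    return size * hashkeys

def minimum_blocks(size, hashkeys, block_size):
    """The same space expressed in whole blocks, rounded up."""
    return math.ceil(size * hashkeys / block_size)

def keys_per_bucket(distinct_keys, hashkeys):
    """Average number of distinct key values sharing one hash bucket."""
    return distinct_keys / hashkeys

print(preallocated_bytes(SIZE, HASHKEYS))          # 8192 * 100 bytes
print(minimum_blocks(SIZE, HASHKEYS, BLOCK_SIZE))  # blocks at the assumed size
print(keys_per_bucket(1000, HASHKEYS))             # 1000 departments, 100 buckets
```

The last call hints at why performance degrades when far more distinct values arrive than HASHKEYS anticipated: each bucket then holds several key values, so a lookup has to read and filter more data.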
