Oracle: Dynamic Sampling

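By default, Oracle assumes that the columns appearing in a SQL statement's WHERE clause are independent of one another. When the predicates are combined with AND, the optimizer therefore takes the combined selectivity to be the product of the individual predicates' selectivities, and uses that figure to estimate the cardinality of the result set.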

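For example:

select count(1) from people where sex='男' and birth_month='9月';

Selectivity: sex has only two values (here '男', male), so its selectivity is 1/2;

there are only 12 months (here '9月', September), so the selectivity of birth_month is 1/12;

the combined selectivity is therefore 1/2 * 1/12 = 1/24.

If the people table has 10000 rows, the estimated cardinality is 10000 * (1/2 * 1/12) ≈ 417. This estimate is sound, because sex and birth_month are unrelated.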
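The arithmetic above can be checked with a one-line query against dual. This is only a sketch of the independence calculation: the 10000-row count and the two selectivities are the illustrative figures from the example, not statistics read from a real people table.

```sql
-- Independence assumption: multiply the per-predicate selectivities,
-- then scale by the table's row count. All three numbers are the
-- illustrative figures from the example above.
SELECT ROUND(10000 * (1/2) * (1/12)) AS estimated_rows
FROM   dual;
```

The query returns 417, since 10000/24 is about 416.67.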

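Now consider another example:

select count(1) from people where birth_month='9月' and constellation='處女座';

Computing the selectivity the same way goes wrong here, because birth month and constellation (here '處女座', Virgo) are correlated, so the predicate columns cannot be treated as independent. In cases like this, dynamic sampling is needed to avoid the problem.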
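One way to see how far off the independence assumption is would be to compare it with a direct count. The query below is a sketch against the hypothetical people table from the example; the table and its columns are illustrative only, and the query assumes the table is not empty (otherwise the division fails).

```sql
-- Compare the independence-based estimate with the real row count.
-- independent_estimate = total * (month/total) * (sign/total)
--                      = month * sign / total
SELECT total_rows,
       month_rows,
       sign_rows,
       ROUND(month_rows * sign_rows / total_rows) AS independent_estimate,
       both_rows                                  AS actual_rows
FROM  (SELECT COUNT(*)                                            AS total_rows,
              COUNT(CASE WHEN birth_month = '9月' THEN 1 END)      AS month_rows,
              COUNT(CASE WHEN constellation = '處女座' THEN 1 END) AS sign_rows,
              COUNT(CASE WHEN birth_month = '9月'
                          AND constellation = '處女座' THEN 1 END) AS both_rows
       FROM   people);
```

Since most Virgo birthdays fall in September, actual_rows would be expected to come out well above independent_estimate on realistic data.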

 

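Dynamic sampling serves the following purposes:

1. Whatever the relationship between the columns in the target SQL statement, in most cases the CBO can assess the relationship between the predicate columns fairly accurately, and so arrive at a correct selectivity and a correct cardinality.

2. Some applications create temporary tables at run time, load intermediate data into them for later queries, and drop them when they finish. Such tables have no statistics when they take part in a query, which can make the optimizer's estimates inaccurate or outright wrong. Dynamic sampling avoids this.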

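Ways to enable dynamic sampling:

1. Set the optimizer_dynamic_sampling parameter to a value greater than or equal to 1; this turns dynamic sampling on.

2. Force it with a hint: dynamic_sampling(t,level) forces dynamic sampling at the given level on the target table T.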
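The two approaches can be sketched as follows. The levels chosen here (2 and 4) are arbitrary illustrations rather than recommendations, and t2 is the test table built later in this article.

```sql
-- Session level: any value >= 1 enables dynamic sampling for this session.
ALTER SESSION SET optimizer_dynamic_sampling = 2;

-- Statement level: sample table t2 at level 4 for this query only.
-- If the table has an alias in the FROM clause, the hint must use the alias.
SELECT /*+ dynamic_sampling(t2 4) */ COUNT(*)
FROM   t2
WHERE  n1 = 3
AND    n2 = 3;
```

The hint affects only the statement that carries it, which makes it the less intrusive option when a single query is misestimated.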

 

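Value range of the optimizer_dynamic_sampling parameter: integer levels from 0 to 10 (Oracle 12c adds level 11); 0 disables dynamic sampling, and the default is 2 since Oracle 10g.

For the dynamic sampling hint dynamic_sampling(t,level), level likewise ranges from 0 to 10; at level 10 every block of the table is read for sampling.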

Test environment:

SQL> create table t2 (c1 varchar2(1),c2 varchar2(2000),n1 number,n2 number);

 

SQL> insert into t2 select 'a','a',trunc(dbms_random.value(0,20)),trunc(dbms_random.value(0,25)) from dba_objects where rownum<10001;

SQL> create index idx_t2 on t2(n1,n2,c1);

SQL> exec dbms_stats.gather_table_stats('Sh','T2',cascade=>true);

 

PL/SQL procedure successfully completed.

 

SQL> @/home/oracle/scripts/sosi.txt;

Table Name  Number of Rows  Blocks  Empty Blocks  Average Space  Chain Count  Average Row Len  Global Stats  User Stats  Sample Size  Date MM-DD-YYYY
----------  --------------  ------  ------------  -------------  -----------  ---------------  ------------  ----------  -----------  ---------------
T2          10,000          28      0             0              0            10               YES           NO          10,000       05-04-2018

 

Column Name  Column Details  Distinct Values  Density  Number Buckets  Number Nulls  Global Stats  User Stats  Sample Size  Date MM-DD-YYYY
-----------  --------------  ---------------  -------  --------------  ------------  ------------  ----------  -----------  ---------------
C1           VARCHAR2(1)     1                1        1               0             YES           NO          10,000       05-04-2018
C2           VARCHAR2(2000)  1                1        1               0             YES           NO          10,000       05-04-2018
N1           NUMBER(22)      20               0        1               0             YES           NO          10,000       05-04-2018
N2           NUMBER(22)      25               0        1               0             YES           NO          10,000       05-04-2018

 

Estimated cardinality for the predicates n1=1 and n2=3 and c1='a': 10000 * 1/1 (c1) * 1/20 (n1) * 1/25 (n2) = 20;
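The per-column selectivities used in this calculation can be read from the data dictionary instead of the report script. The query below is a sketch that assumes the session is connected as the owner of T2, and that each selectivity is simply 1/NUM_DISTINCT, which holds for equality predicates on columns with no histogram and no nulls.

```sql
-- Per-column selectivity under the "no histogram, no nulls" assumption.
SELECT column_name,
       num_distinct,
       ROUND(1 / num_distinct, 4) AS selectivity
FROM   user_tab_col_statistics
WHERE  table_name = 'T2'
AND    column_name IN ('C1', 'N1', 'N2');
```

Multiplying the three selectivity values by the table's row count reproduces the optimizer's independence-based estimate.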

 

SQL> select count(1) from t2 where n1=1 and n2=3 and c1='a';

 

COUNT(1)

----------

14

Check the execution plan:

SQL> select * from table(dbms_xplan.display_cursor(null,0));

 

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID 3dant16ufywy5, child number 0

-------------------------------------

select count(1) from t2 where n1=1 and n2=3 and c1='a'

 

Plan hash value: 4191549303

 

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | 1 (100)| |

| 1 | SORT AGGREGATE | | 1 | 8 | | |

|* 2 | INDEX RANGE SCAN| IDX_T2 | 20 | 160 | 1 (0)| 00:00:01 |

----------------------------------------------------------------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

2 - access("N1"=1 AND "N2"=3 AND "C1"='a')

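The execution plan reports 20 rows, our own calculation also gives 20, and the query actually returns 14, so the estimate is roughly in line with the real result.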

 

SQL> update t2 set n2=n1;

SQL> exec dbms_stats.gather_table_stats('Sh','T2',cascade=>true);

SQL> @/home/oracle/scripts/sosi.txt;

 

Table Name  Number of Rows  Blocks  Empty Blocks  Average Space  Chain Count  Average Row Len  Global Stats  User Stats  Sample Size  Date MM-DD-YYYY
----------  --------------  ------  ------------  -------------  -----------  ---------------  ------------  ----------  -----------  ---------------
T2          10,000          28      0             0              0            10               YES           NO          10,000       05-04-2018

 

Column Name  Column Details  Distinct Values  Density  Number Buckets  Number Nulls  Global Stats  User Stats  Sample Size  Date MM-DD-YYYY
-----------  --------------  ---------------  -------  --------------  ------------  ------------  ----------  -----------  ---------------
C1           VARCHAR2(1)     1                0        1               0             YES           NO          10,000       05-04-2018
C2           VARCHAR2(2000)  1                1        1               0             YES           NO          10,000       05-04-2018
N1           NUMBER(22)      20               0        20              0             YES           NO          10,000       05-04-2018
N2           NUMBER(22)      20               0        20              0             YES           NO          10,000       05-04-2018

Using the same calculation as before, the estimated cardinality is: 10000 * 1/20 * 1/20 = 25

 

SQL> select count(1) from t2 where n1=3 and n2=3 and c1='a';

 

COUNT(1)

----------

498

The query actually returns 498 rows.

 

SQL> select * from table(dbms_xplan.display_cursor(null,0));

 

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID 1tpvtscuhy3b6, child number 0

-------------------------------------

select count(1) from t2 where n1=3 and n2=3 and c1='a'

 

Plan hash value: 4191549303

 

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | 1 (100)| |

| 1 | SORT AGGREGATE | | 1 | 8 | | |

|* 2 | INDEX RANGE SCAN| IDX_T2 | 25 | 200 | 1 (0)| 00:00:01 |

----------------------------------------------------------------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

2 - access("N1"=3 AND "N2"=3 AND "C1"='a')

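The execution plan reports 25 rows, which matches our calculated cardinality of 25, yet the query actually returns 498 rows. The gap is large, so the earlier calculation no longer applies: after the update, n2 always equals n1, so the two predicates are fully correlated. The CBO still multiplies the selectivities as if the columns were independent, and its estimate ends up far from the real figure.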
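The correlation can be confirmed directly by counting distinct (n1, n2) pairs. This is a sketch run against the t2 test table; the comma separator is an assumption that works here because both columns hold whole numbers.

```sql
-- For independent columns, the number of distinct (n1, n2) pairs
-- approaches ndv_n1 * ndv_n2. When n2 is fully determined by n1,
-- it collapses to ndv_n1.
SELECT COUNT(DISTINCT n1)              AS ndv_n1,
       COUNT(DISTINCT n2)              AS ndv_n2,
       COUNT(DISTINCT n1 || ',' || n2) AS ndv_pairs
FROM   t2;
```

After update t2 set n2=n1, ndv_pairs equals ndv_n1, which the statistics above report as 20. The realistic estimate for n1=3 and n2=3 is therefore 10000 * 1/20 = 500, close to the 498 rows actually returned.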

Next, use dynamic sampling to help the CBO choose the right execution plan:

SQL> select /*+dynamic_sampling(t2,2)*/ count(1) from t2 where n1=3 and n2=3 and c1='a';

 

COUNT(1)

----------

498

 

SQL> select * from table(dbms_xplan.display_cursor(null,0));

 

PLAN_TABLE_OUTPUT

------------------------------------------------------------------------------------------------------------------------------------------------------

SQL_ID cq6v06fah22r8, child number 0

-------------------------------------

select /*+dynamic_sampling(t2,2)*/ count(1) from t2 where n1=3 and n2=3

and c1='a'

 

Plan hash value: 4191549303

 

----------------------------------------------------------------------------

| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |

----------------------------------------------------------------------------

| 0 | SELECT STATEMENT | | | | 3 (100)| |

| 1 | SORT AGGREGATE | | 1 | 8 | | |

|* 2 | INDEX RANGE SCAN| IDX_T2 | 498 | 3984 | 3 (0)| 00:00:01 |

----------------------------------------------------------------------------

 

Predicate Information (identified by operation id):

---------------------------------------------------

 

2 - access("N1"=3 AND "N2"=3 AND "C1"='a')

 

Note

-----

- dynamic sampling used for this statement (level=2)

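As shown above, the execution plan now reports 498 rows, matching the actual result. The conclusion: when the predicate columns of the target SQL statement are correlated, dynamic sampling can estimate the cardinality correctly and so help the CBO choose the right execution plan.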
