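One of the key points of SQL performance tuning is to let the small table drive the large table. Today I ran an experiment on it.
Hardware: T440p
Software: Win10
Database: Oracle Database 11g Enterprise Edition Release 11.2.0.1.0 - 64bit Production
Tables: bigtable has the columns id, name, score and createtime and holds one million rows; smalltable has the three columns id, name and createtime and holds one hundred thousand rows.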
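For readers who want to reproduce the setup, the two tables might be declared roughly as follows. This is only a sketch: the column names come from the description above, while every data type and length is an assumption (NVARCHAR2 is guessed because the predicates in the execution plans further down use a national-character literal, U'\5F20%').

```sql
-- Hypothetical DDL: column names are from the post, data types are assumed.
create table bigtable (
    id         number        not null,
    name       nvarchar2(20) not null,
    score      number,
    createtime date
);

create table smalltable (
    id         number        not null,
    name       nvarchar2(20) not null,
    createtime date
);
```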
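Requirement: first select from the small table the names that begin with 張, then semi-join them with the big table. This is meant to mimic the example on p. 187 of 《SQL優化核心思想》 (羅炳森, 黃超 et al.). Otherwise select * from bigtable where name like '張%' would be enough.
Goal: compare the elapsed time of the conventional approach with that of the tuned small-table-drives-large-table approach.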
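The elapsed times reported below look like the output of SQL*Plus timing. A minimal sketch of a session that measures a query without printing every row might look like this; it assumes the account is allowed to use AUTOTRACE (which normally requires the PLUSTRACE role), and it is not necessarily how the numbers in this post were collected.

```sql
-- SQL*Plus client settings; comments are kept on their own lines.
-- Print the elapsed time after each statement:
set timing on
-- Execute the query and fetch the rows, but print only run-time statistics:
set autotrace traceonly statistics

select * from bigtable where name in (select name from smalltable where name like '張%');

-- Restore normal output:
set autotrace off
```

Suppressing the row output keeps terminal rendering out of the measurement, which matters once a query returns tens of thousands of rows.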
Conventional approach:
select * from bigtable where name in (select name from smalltable where name like '張%')
Small-table-drives-large-table approach:
select /*+ leading(big@small) use_nl(big@small,big */ * from bigtable big
where name in (select /*+ qb_name(small) */ name from smalltable where name like '張%')
Results:
# | Conventional approach | Small-drives-large approach |
1 | 00:00:04.81 | 00:00:03.54 |
2 | 00:00:04.04 | 00:00:03.50 |
3 | 00:00:04.06 | 00:00:03.47 |
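As you can see, small table driving large table is indeed a bit faster, but the advantage is still not obvious. Let me enlarge the big table.
I used the following SQL to grow bigtable to two million rows, which is already the most my poor T440p can take in a single insert.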
Insert into bigtable select rownum,dbms_random.string('*',dbms_random.value(6,20)),dbms_random.value(0,20),sysdate from dual connect by level<=2000000 order by dbms_random.random
The results are as follows:
# | Conventional approach | Small-drives-large approach |
1 | 00:00:07.29 | 00:00:06.75 |
2 | 00:00:07.80 | 00:00:06.81 |
3 | 00:00:07.37 | 00:00:06.87 |
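The advantage is somewhat clearer now.
The second experiment again confirmed the advantage of small table driving large table.
Third experiment
Since I could not grow bigtable any further, this time I increased the number of rows in the big table whose name starts with 張. Here are the counts: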
SQL> select count(*) from smalltable where name like '張%';

  COUNT(*)
----------
       212

Elapsed: 00:00:00.03
SQL> select count(*) from bigtable where name like '張%';

  COUNT(*)
----------
     48330

Elapsed: 00:00:00.12
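The small table has about two hundred 張 rows (212); the big table has about forty-eight thousand (48330).
Three more runs:
# | Conventional approach | Small-drives-large approach |
1 | 00:01:16.40 | 00:01:12.54 |
2 | 00:01:17.14 | 00:01:15.60 |
3 | 00:01:15.73 | 00:01:15.30 |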
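This time, as the amount of matching data grew, the advantage actually became weaker and weaker!
Then I remembered that "small table drives large table" has a second half: the large table should be accessed through an index. So let me add an index on the name column of the big table and try again: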
SQL> create index bigtable_name_index on bigtable(name);

Index created.

Elapsed: 00:00:06.02
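Three more runs:
# | Conventional approach | Small-drives-large approach |
1 | 00:01:15.62 | 00:01:13.59 |
2 | 00:01:16.27 | 00:01:16.24 |
3 | 00:01:17.50 | 00:01:16.58 |
A baffling result. I had expected both approaches to get faster with the index, and the small-drives-large approach to pull clearly ahead of the conventional one, but neither happened. My guess was that the database was still doing full table scans.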
Let's look at the execution plan of select * from bigtable where name in (select name from smalltable where name like '張%'):
SQL> set autotrace trace exp
SQL> select * from bigtable where name in (select name from smalltable where name like '張%');
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 4026363122

----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                     |   311 | 50693 |   446   (1)| 00:00:06 |
|   1 |  NESTED LOOPS                 |                     |       |       |            |          |
|   2 |   NESTED LOOPS                |                     |   311 | 50693 |   446   (1)| 00:00:06 |
|   3 |    SORT UNIQUE                |                     |   178 | 11036 |   172   (2)| 00:00:03 |
|*  4 |     TABLE ACCESS FULL         | SMALLTABLE          |   178 | 11036 |   172   (2)| 00:00:03 |
|*  5 |    INDEX RANGE SCAN           | BIGTABLE_NAME_INDEX |     3 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | BIGTABLE            |     2 |   202 |     5   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("NAME" LIKE U'\5F20%')
   5 - access("NAME"="NAME")
       filter("NAME" LIKE U'\5F20%')

Note
-----
   - dynamic sampling used for this statement (level=2)
And the execution plan of select /*+ leading(big@small) use_nl(big@small,big */ * from bigtable big where name in (select /*+ qb_name(small) */ name from smalltable where name like '張%'):
SQL> select /*+ leading(big@small) use_nl(big@small,big */ * from bigtable big
  2  where name in (select /*+ qb_name(small) */ name from smalltable where name like '張%');
Elapsed: 00:00:00.00

Execution Plan
----------------------------------------------------------
Plan hash value: 4026363122

----------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                     |   311 | 50693 |   446   (1)| 00:00:06 |
|   1 |  NESTED LOOPS                 |                     |       |       |            |          |
|   2 |   NESTED LOOPS                |                     |   311 | 50693 |   446   (1)| 00:00:06 |
|   3 |    SORT UNIQUE                |                     |   178 | 11036 |   172   (2)| 00:00:03 |
|*  4 |     TABLE ACCESS FULL         | SMALLTABLE          |   178 | 11036 |   172   (2)| 00:00:03 |
|*  5 |    INDEX RANGE SCAN           | BIGTABLE_NAME_INDEX |     3 |       |     2   (0)| 00:00:01 |
|   6 |   TABLE ACCESS BY INDEX ROWID | BIGTABLE            |     2 |   202 |     5   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - filter("NAME" LIKE U'\5F20%')
   5 - access("NAME"="NAME")
       filter("NAME" LIKE U'\5F20%')

Note
-----
   - dynamic sampling used for this statement (level=2)
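In the end the two execution plans turn out to be the same: both report plan hash value 4026363122. It looks like Oracle had already optimized the conventional query internally, and the direction of that optimization is exactly small table drives large table.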
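Another possible reason for the identical plans is that the hint in the second query never took effect. As written, its use_nl( is missing the closing parenthesis, and big@small refers to an alias big inside the query block named small, where the only table is smalltable. Oracle silently ignores hints it cannot parse or resolve. A sketch of a form that addresses the subquery's table by its own name is shown below; it has not been run against this data, and whether the optimizer honours it still depends on the subquery being unnested.

```sql
-- Sketch only: drive from smalltable (query block "small"), then nested-loop into bigtable.
select /*+ leading(smalltable@small) use_nl(smalltable@small, big) */ *
from bigtable big
where name in (select /*+ qb_name(small) */ name
               from smalltable
               where name like '張%');
```

Since the optimizer had already chosen smalltable as the driving table on its own, a working hint of this shape would most likely yield the same plan again rather than a faster one.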
What if I force the big table to drive? Wouldn't the difference show up then?
SQL> select /*+ leading(big) */ * from bigtable big where name in (select /*+ qb_name(small) */ name from smalltable where name like '張%');
Elapsed: 00:00:00.01

Execution Plan
----------------------------------------------------------
Plan hash value: 3046684681

-----------------------------------------------------------------------------------------
| Id  | Operation          | Name       | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |            |   312 | 50856 |       |  4297   (2)| 00:00:52 |
|*  1 |  HASH JOIN SEMI    |            |   312 | 50856 |  4920K|  4297   (2)| 00:00:52 |
|*  2 |   TABLE ACCESS FULL| BIGTABLE   | 44556 |  4394K|       |  3883   (2)| 00:00:47 |
|*  3 |   TABLE ACCESS FULL| SMALLTABLE |   178 | 11036 |       |   172   (2)| 00:00:03 |
-----------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("NAME"="NAME")
   2 - filter("NAME" LIKE U'\5F20%')
   3 - filter("NAME" LIKE U'\5F20%')

Note
-----
   - dynamic sampling used for this statement (level=2)
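In this plan Oracle first filters the big table with a full scan, then filters the small table the same way, and finally hash semi-joins the two.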
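To make the term "semi-join" concrete: a row of bigtable is returned once as soon as at least one matching row exists in smalltable, no matter how many matches there are. The same requirement can be written with EXISTS, as in the sketch below; the aliases b and s are introduced here for illustration only.

```sql
-- Equivalent semi-join formulation: each bigtable row is returned at most once.
select b.*
from bigtable b
where exists (select 1
              from smalltable s
              where s.name = b.name
                and s.name like '張%');
```

Oracle is generally able to treat the IN form and the EXISTS form as the same semi-join, so this rewrite is about readability rather than a different plan.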
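Let's see how long this one takes to run:
48330 rows selected.
Elapsed: 00:00:09.75
That is faster than small table driving large table (9.75 seconds, against roughly 1 minute 15 seconds in the runs above)!
Now let's see how long the "ultimate" SQL takes: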
select * from bigtable where name like '張%'
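48330 rows selected.
Elapsed: 00:00:09.94
It seems the data was not complex enough, so Oracle's optimizer easily found the best plan by itself. This experiment did not achieve the effect I had expected.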
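One detail in the plans above may be worth a follow-up: each of them carries the note "dynamic sampling used for this statement (level=2)", which usually means no optimizer statistics had been gathered on the tables. A possible next step, sketched here without having been run on this data, is to gather statistics and then look at the plan that was actually executed together with its real row counts. Running it with serveroutput off is an assumption made so that display_cursor picks up the query rather than a DBMS_OUTPUT call.

```sql
set serveroutput off

-- Gather optimizer statistics for both tables; cascade => true includes their indexes.
begin
    dbms_stats.gather_table_stats(ownname => user, tabname => 'BIGTABLE',   cascade => true);
    dbms_stats.gather_table_stats(ownname => user, tabname => 'SMALLTABLE', cascade => true);
end;
/

-- Execute the query while collecting row-source statistics.
select /*+ gather_plan_statistics */ *
from bigtable
where name in (select name from smalltable where name like '張%');

-- Show the plan of the last statement executed in this session, with actual rows and timings.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Comparing the E-Rows and A-Rows columns of that output would show how far the optimizer's estimates are from what really happened at each step.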
--END-- January 5, 2020, 10:28