First, the table:
CREATE TABLE `page_test` (
    `id` INT(11) NOT NULL AUTO_INCREMENT,
    `name` VARCHAR(20) NOT NULL,
    `email` VARCHAR(40) NOT NULL,
    `solved_number` INT(11) NOT NULL,
    PRIMARY KEY (`id`)
)
COLLATE='utf8_general_ci'
ENGINE=InnoDB
AUTO_INCREMENT=1;
Generate 1,000,086 rows with a stored procedure:
delimiter $$
CREATE PROCEDURE pre()
BEGIN
    DECLARE i INT;
    SET i = 1;
    WHILE i < 1000086 DO
        INSERT INTO page_test
        VALUES (i, substring(MD5(RAND()),1,20), substring(MD5(RAND()),1,20), i);
        SET i = i + 1;
    END WHILE;
END $$
delimiter ;
CALL pre();
On Win10 + an i3 (storage: a no-name eMLC drive), the insert speed was about 1.8 MB/s.
Switching to importing a SQL file directly.
First, the packet size has to be raised, otherwise the import dies with "MySQL server has gone away":
set global max_allowed_packet=1068435456;
(an imprecise but big-enough number)
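To confirm the new value took effect (sessions opened before the change keep the old value, so reconnect before importing):

SHOW GLOBAL VARIABLES LIKE 'max_allowed_packet';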
Generate the SQL file with C++:
#include <bits/stdc++.h>
using namespace std;

const int MAXN = 5e6 + 11;
char rnd[13];   // random 12-char name (globals are zero-initialized, so already null-terminated)
char rnd2[13];  // random 12-digit "email"

int main() {
    freopen("insert.txt", "w", stdout);
    int cur = 0;
    printf("INSERT INTO training.page_test\nVALUES");
    while (cur++ < MAXN) {
        for (int i = 0; i < 12; i++) {
            rnd[i]  = (rand() % 26) + 'a';
            rnd2[i] = (rand() % 10) + '0';
        }
        printf("(%d,'%s','%s',%d)", cur, rnd, rnd2, cur);
        if (cur < MAXN) printf(",\n");  // no comma after the last tuple
    }
    printf(";\n");  // close the INSERT statement
    return 0;
}
mysql -uroot -p123456 < D:\Code\cpp\insert.txt
IO ran at roughly 60-130 MB/s (memory usage was 2-3 GB).
SELECT COUNT(*) FROM page_test;
/* Affected rows: 0  Found rows: 1  Warnings: 0  Duration for 1 query: 1.016 sec. */

SELECT * FROM page_test LIMIT 5000002,1;
/* Affected rows: 0  Found rows: 1  Warnings: 0  Duration for 1 query: 3.031 sec. */

SELECT * FROM page_test LIMIT 5000002,3;
/* Affected rows: 0  Found rows: 3  Warnings: 0  Duration for 1 query: 3.110 sec. */
Going through the index
SELECT * FROM page_test
WHERE id = (SELECT id FROM page_test LIMIT 5000003,1);
/* Affected rows: 0  Found rows: 1  Warnings: 0  Duration for 1 query: 2.219 sec. */
Multiple ids:
SELECT a.*
FROM page_test a
JOIN (SELECT id FROM page_test LIMIT 5000001,5) b ON a.id = b.id;
/* Affected rows: 0  Found rows: 5  Warnings: 0  Duration for 1 query: 2.219 sec. */
If the ids are known to fall within some range, you can do this:
SELECT * FROM page_test a
WHERE a.id >= 5000002 AND a.id <= 5000006;
/* Affected rows: 0  Found rows: 5  Warnings: 0  Duration for 1 query: 0.000 sec. */
EXPLAIN analysis
LIMIT on an indexed column
EXPLAIN SELECT a.id FROM page_test a LIMIT 5000001,5;
Takes about 2.2 s.
LIMIT on a non-indexed column
EXPLAIN SELECT a.email FROM page_test a LIMIT 5000001,5;
About the same, 2.3 s. Although type=ALL looks a bit worse than type=index in the EXPLAIN output, in practice there is no difference.
In fact, with or without an index, LIMIT runs at almost the same speed; my personal guess is that this has to do with InnoDB keeping the primary-key index and the row data together (the clustered index).
In other words, indexes cannot save you from the LIMIT nightmare.
As for the earlier trick of fetching the ids through the index first and then joining, its EXPLAIN output looks less ugly and it adapts flexibly to changes, but in practice it runs... about the same too.
Real indexes vs. fake indexes
EXPLAIN SELECT * FROM page_test a WHERE a.id >= 5000002 AND a.id <= 5000006;
Only 5 rows examined, 0 s.
So if there is no need to delete rows and the IDs are guaranteed contiguous, replacing LIMIT paging with a WHERE range is the best choice.
(The big OJs all use VOLume numbers for paging, which evidently is not without reason.)
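As a minimal sketch of that mapping (the page size of 50 and the 1-based page number are hypothetical parameters, not from the tests above):

-- Hypothetical: page_size = 50, requesting 1-based page 100001.
-- Valid only while ids are contiguous with no holes.
SELECT *
FROM page_test
WHERE id >= (100001 - 1) * 50 + 1   -- first id on the page: 5000001
  AND id <= 100001 * 50;            -- last id on the page: 5000050

In real code the page number becomes a bound parameter; the query itself stays a pure primary-key range scan.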
Postscript
1. Tried turning on query_cache; found it not flexible enough, so dropped it.
2. Saw someone use an index + ORDER BY to optimize LIMIT queries near the tail, which seemed interesting.
Tried it:
SELECT * FROM page_test ORDER BY id DESC LIMIT 5;
No need to time it, the result is instant (DESC needs no real sort: it is a Backward index scan).
Adding this optimization guarantees the worst case occurs only around offset n/2, which is very practical (most people look at either the front pages or the back pages); a sketch follows.
Of course, the ORDER BY column must be covered by an index.
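A sketch of how the backward scan generalizes to any page in the back half (assuming the total row count is known, 5,000,011 as generated above; the outer query just restores ascending order):

SELECT * FROM (
    -- offset from the tail: 5000011 (total) - 5000001 (front offset) - 5 (page size) = 5
    SELECT * FROM page_test ORDER BY id DESC LIMIT 5, 5
) t
ORDER BY t.id;  -- same rows as LIMIT 5000001,5, fetched from the cheap end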
3. Maintain page numbers with a derived table/column (sketched below).
If you are willing to spend O(n) to maintain it on every insert/delete, it is actually a decent approach.
But then the number of tables flat-out doubles.
And for data at the millions-to-tens-of-millions scale, that O(n) maintenance costs on the order of seconds per update.
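A sketch of the column variant (row_num and idx_row_num are hypothetical names, not part of the original schema):

-- An indexed rank column kept equal to the row's current position.
ALTER TABLE page_test
    ADD COLUMN row_num INT NOT NULL DEFAULT 0,
    ADD INDEX idx_row_num (row_num);

-- Paging becomes an index range scan, immune to holes in id:
SELECT * FROM page_test WHERE row_num BETWEEN 5000002 AND 5000006;

-- The O(n) maintenance cost: a delete must shift every later rank.
UPDATE page_test SET row_num = row_num - 1 WHERE row_num > 12345;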
4. On my MySQL version, the subquery form IN (SELECT ... LIMIT ...) is not allowed (MySQL rejects 'LIMIT & IN/ALL/ANY/SOME subquery'); I suspect it would perform about the same even if it were.
5. In theory LIMIT m,n could be answered in O(log m) by maintaining subtree sizes in the B+ tree (an order-statistics tree), but presumably because of SQL's complexity, no such operation actually exists.
6. Consider your requirements.
If deletion really is needed, blocking access instead of physically deleting is one strategy; IDs stay contiguous and range-based paging keeps its high performance.
Under that same premise, a modification can in some cases be implemented as a swap of row contents (guaranteed by ID uniqueness), and paging still queries fast.
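A sketch of the soft-delete variant (the hidden column is a hypothetical name):

-- "Delete" by flagging; ids stay contiguous:
ALTER TABLE page_test ADD COLUMN hidden TINYINT NOT NULL DEFAULT 0;
UPDATE page_test SET hidden = 1 WHERE id = 12345;

-- The range query keeps its 0-second profile; the application renders
-- hidden rows as blocked instead of filtering them out:
SELECT * FROM page_test WHERE id >= 5000002 AND id <= 5000006;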