CREATE TABLE `ibd2_test` ( `id` int(11) NOT NULL, `name` varchar(20) NOT NULL ) ENGINE=InnoDB DEFAULT CHARSET=utf8
+----+-------+
| id | name  |
+----+-------+
|  1 | test1 |
|  2 | test2 |
|  3 | test3 |
|  4 | test4 |
|  5 | test5 |
+----+-------+
5 rows in set (0.00 sec)
Afterwards, delete the row with id 3 and insert 4 more rows. The final result:
localhost.test>select * from ibd2_test;
+----+-------+
| id | name  |
+----+-------+
|  1 | test1 |
|  2 | test2 |
|  4 | test4 |
|  5 | test5 |
|  6 | test6 |
|  7 | test7 |
|  8 | test8 |
|  9 | test9 |
+----+-------+
8 rows in set (0.00 sec)
The tool used below is an InnoDB extract script I wrote myself in Python.
First, recall how the MySQL source defines the record format, in the file rem0rec.c (lines 77~104):
/* PHYSICAL RECORD (NEW STYLE)
===========================

The physical record, which is the data type of all the records
found in index pages of the database, has the following format
(lower addresses and more significant bits inside a byte are below
represented on a higher text line):

| length of the last non-null variable-length field of data:
if the maximum length is 255, one byte; otherwise,
0xxxxxxx (one byte, length=0..127), or 1exxxxxxxxxxxxxx (two bytes,
length=128..16383, extern storage flag) |
...
| length of first variable-length field of data |
| SQL-null flags (1 bit per nullable field), padded to full bytes |
| 4 bits used to delete mark a record, and mark a predefined
minimum record in alphabetical order |
| 4 bits giving the number of records owned by this record
(this term is explained in page0page.h) |
| 13 bits giving the order number of this record in the
heap of the index page |
| 3 bits record type: 000=conventional, 001=node pointer (inside B-tree),
010=infimum, 011=supremum, 1xx=reserved |
| two bytes giving a relative pointer to the next record in the page |
ORIGIN of the record
| first field of data |
...
| last field of data |
*/
Drawn as a diagram, the record header looks like this:
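info bits
The third bit (counting from the left of this 4-bit field) indicates whether the row has been deleted: 1 if deleted, 0 if not. The fourth bit indicates whether the record is the predefined minimum record: 1 if it is.
n_owned
The number of records owned by this record, that is, the number of records in the page directory slot that this record belongs to.
order
The position in the index heap. The infimum pseudo-record has 0 here and the supremum pseudo-record has 1, so real records start from 2. This value reflects the physical order of the records, not their logical order; we verify this later.
record type
The type of the record: 0 for a data row, 1 for a node pointer, 2 for the infimum pseudo-record, 3 for the supremum pseudo-record.
next record offset
The relative offset of the next record. By following next record offset we can traverse every record in a page. Records are organized as a linked list.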
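The five fields above fit into the 5 bytes immediately before the record origin. The following is a minimal sketch of how they could be decoded, not the code of the author's tool; the function name `parse_compact_header` and the dictionary keys are made up for illustration.

```python
import struct

def parse_compact_header(header: bytes) -> dict:
    """Decode the 5 fixed header bytes that precede a compact record's origin.

    Hypothetical helper for illustration only.
    Layout: 4 bits info | 4 bits n_owned | 13 bits order | 3 bits type | 16 bits next.
    """
    if len(header) != 5:
        raise ValueError("expected exactly 5 header bytes")
    # B = first byte, H = unsigned 16-bit, h = signed 16-bit (two's complement)
    first, order_and_type, next_rel = struct.unpack(">BHh", header)
    return {
        "info_bits": first >> 4,          # upper nibble
        "deleted": bool(first & 0x20),    # third bit of the info nibble
        "min_rec": bool(first & 0x10),    # fourth bit of the info nibble
        "n_owned": first & 0x0F,          # lower nibble
        "order": order_and_type >> 3,     # upper 13 bits
        "record_type": order_and_type & 0x07,  # lower 3 bits
        "next_offset": next_rel,          # relative, may be negative
    }

# Bytes built by hand to mirror a header with info_bits 0010, order 8, next offset 0.
print(parse_compact_header(bytes([0x20, 0x00, 0x40, 0x00, 0x00])))
```

Reading the next pointer as a signed 16-bit value is what makes backward links such as -150 come out directly, without a separate two's complement step.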
step 1: first, let's look at the page before the record with id 3 is deleted:
[root@hebe211 ibd]# python innodb_extract.py ibd2_test.ibd
infimum
row_id:000000000213,info_bits:0000,n_owned:0000,order:2(0000000000010),next offset:34(0000000000100010)
1 test1
row_id:000000000214,info_bits:0000,n_owned:0000,order:3(0000000000011),next offset:34(0000000000100010)
2 test2
row_id:000000000215,info_bits:0000,n_owned:0000,order:4(0000000000100),next offset:34(0000000000100010)
3 test3
row_id:000000000216,info_bits:0000,n_owned:0000,order:5(0000000000101),next offset:34(0000000000100010)
4 test4
row_id:000000000217,info_bits:0000,n_owned:0000,order:6(0000000000110),next offset:-150(1111111101101010)
5 test5
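First, since we did not define a primary key, the system automatically creates a 6-byte row_id as a hidden primary key. The last two bytes of each record header hold the relative offset to the start of the next record's row_id. The linked list is organized by the clustered index, which means the logical records are chained together in clustered index order. Looking at the physical order, it is 2->3->4->5->6, which at this point is exactly the same as the clustered index order! (My tool also filters out the infimum and supremum pseudo-records, whose order values are 0 and 1 respectively; I won't go into them here.)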
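To make the traversal concrete, here is a small sketch that follows the relative pointers from step 1. It is a toy model, not the author's tool: the absolute offsets 99, 126, 160, ... come from the tool's debug output later in this post, while the infimum pointer of 27 (126 - 99) and the supremum position of 112 (262 - 150) are values I derived and should be read as assumptions.

```python
INFIMUM = 99    # taken from the later debug output
SUPREMUM = 112  # assumption: derived from 262 + (-150)

# record start offset -> (label, relative offset of the next record)
page = {
    99: ("infimum", 27),   # assumption: 126 - 99
    126: ("1 test1", 34),
    160: ("2 test2", 34),
    194: ("3 test3", 34),
    228: ("4 test4", 34),
    262: ("5 test5", -150),
}

def walk(records, start=INFIMUM, stop=SUPREMUM):
    """Follow next-record offsets from `start` until `stop` is reached."""
    rows = []
    pos = start
    for _ in range(len(records) + 1):   # guard against a corrupted, cyclic chain
        if pos == stop:
            return rows
        label, rel = records[pos]
        if pos != start:
            rows.append((pos, label))
        pos += rel                      # pointers are relative to the current record
    raise ValueError("chain did not reach the supremum")

for offset, label in walk(page):
    print(offset, label)
```

Each hop adds the stored value to the current record's own offset, so a negative pointer simply means the next logical record sits earlier in the page.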
step 2: we delete the record with id 3 (row_id 000000000215) and look at what changes:
infimum
row_id:000000000213,info_bits:0000,n_owned:0000,order:2(0000000000010),next offset:34(0000000000100010)
1 test1
row_id:000000000214,info_bits:0000,n_owned:0000,order:3(0000000000011),next offset:68(0000000001000100)
2 test2
row_id:000000000216,info_bits:0000,n_owned:0000,order:5(0000000000101),next offset:34(0000000000100010)
4 test4
row_id:000000000217,info_bits:0000,n_owned:0000,order:6(0000000000110),next offset:-150(1111111101101010)
5 test5
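We can see that the record with row_id 000000000215 has disappeared, which means it was unlinked from the data list. The physical order of the remaining records has not changed: 2->3->5->6. The next-record offset of the second row (row_id 000000000214) is no longer 34 but 68, so it now points to the row with row_id 000000000216. This confirms that the record with id 3 was "unlinked" from the data list rather than physically erased.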
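The pointer change from 34 to 68 is plain linked-list arithmetic: the predecessor absorbs the hop that the removed record used to make. A minimal sketch, assuming the record start offsets 126, 160, 194, 228, 262 reported by the tool's debug output later on; `unlink` is a hypothetical helper and only models the chain, not the bytes that stay behind in the page.

```python
def unlink(next_rel, prev, victim):
    """Return a copy of the chain in which `prev` jumps over `victim`."""
    updated = dict(next_rel)
    # prev -> victim -> successor  becomes  prev -> successor
    updated[prev] = next_rel[prev] + next_rel[victim]
    del updated[victim]   # the victim leaves the chain; its space is still in the page
    return updated

# record start offset -> relative offset of the next record (state of step 1)
chain = {126: 34, 160: 34, 194: 34, 228: 34, 262: -150}

after_delete = unlink(chain, prev=160, victim=194)
print(after_delete[160])  # 68
```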
step 3: we insert 4 more rows and look again:
infimum
row_id:000000000213,info_bits:0000,n_owned:0000,order:2(0000000000010),next offset:34(0000000000100010)
1 test1
row_id:000000000214,info_bits:0000,n_owned:0000,order:3(0000000000011),next offset:68(0000000001000100)
2 test2
row_id:000000000216,info_bits:0000,n_owned:0000,order:5(0000000000101),next offset:34(0000000000100010)
4 test4
row_id:000000000217,info_bits:0000,n_owned:0100,order:6(0000000000110),next offset:-68(1111111110111100)
5 test5
row_id:000000000218,info_bits:0000,n_owned:0000,order:4(0000000000100),next offset:102(0000000001100110)
6 test6
row_id:000000000219,info_bits:0000,n_owned:0000,order:7(0000000000111),next offset:34(0000000000100010)
7 test7
row_id:00000000021a,info_bits:0000,n_owned:0000,order:8(0000000001000),next offset:34(0000000000100010)
8 test8
row_id:00000000021b,info_bits:0000,n_owned:0000,order:9(0000000001001),next offset:-252(1111111100000100)
9 test9
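The order values along the data list are now 2->3->5->6->4->7->8->9. Note that the physical storage order no longer matches the clustered index order! The first newly inserted record (row_id 000000000218) has heap order 4, the value that belonged to the deleted row. At the same time, the next-record offset of row_id 000000000217 has become negative (negative values are stored in two's complement). Row 000000000217 starts at offset 262 in the debug output shown later, and 262 - 68 = 194 is exactly the physical position of the deleted row_id 000000000215, so we can conclude that the deleted space was reused.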
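The bit strings printed in parentheses by the tool can be checked by hand. The sketch below decodes a 16-bit two's complement string and redoes the 262 - 68 arithmetic; `signed16_from_bits` is a hypothetical helper name, and the start offset 262 of row 000000000217 is taken from the debug output later in this post.

```python
def signed16_from_bits(bits: str) -> int:
    """Interpret a 16-character string of 0/1 as a two's complement integer."""
    if len(bits) != 16 or set(bits) - {"0", "1"}:
        raise ValueError("expected 16 binary digits")
    value = int(bits, 2)
    # If the sign bit is set, subtract 2**16 to obtain the negative value.
    return value - 0x10000 if value & 0x8000 else value

rel = signed16_from_bits("1111111110111100")
print(rel)        # -68
print(262 + rel)  # 194, the slot that row 000000000215 used to occupy
```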
step 4: we delete one more row, the one with id 8 (row_id 00000000021a):
localhost.test>select * from ibd2_test;
+----+-------+
| id | name  |
+----+-------+
|  1 | test1 |
|  2 | test2 |
|  4 | test4 |
|  5 | test5 |
|  6 | test6 |
|  7 | test7 |
|  9 | test9 |
+----+-------+
Then we observe the page again, this time with the PAGE HEADER definition from the MySQL source in mind:
/* PAGE HEADER
   ===========

Index page header starts at the first offset left free by the FIL-module */

typedef byte page_header_t;

#define PAGE_HEADER FSEG_PAGE_DATA /* index page header starts at this offset */
/*-----------------------------*/
#define PAGE_N_DIR_SLOTS 0 /* number of slots in page directory */
#define PAGE_HEAP_TOP 2 /* pointer to record heap top */
#define PAGE_N_HEAP 4 /* number of records in the heap, bit 15=flag: new-style compact page format */
#define PAGE_FREE 6 /* pointer to start of page free record list */
#define PAGE_GARBAGE 8 /* number of bytes in deleted records */
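PAGE_FREE and PAGE_GARBAGE hold, respectively, the pointer to the reusable space and the size of the reusable space. We turn on debug output and look at how the physical rows have changed: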
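As a rough sketch of how these two fields could be read from a raw page, assuming that FSEG_PAGE_DATA resolves to 38 bytes (the size of the FIL header) and that both fields are 2-byte big-endian integers; `read_free_info` is a hypothetical helper, and the page used here is synthetic, not bytes from a real .ibd file.

```python
import struct

PAGE_HEADER = 38   # assumption: FSEG_PAGE_DATA, i.e. the 38-byte FIL header
PAGE_FREE = 6      # offsets below are relative to PAGE_HEADER, as in the #defines
PAGE_GARBAGE = 8

def read_free_info(page):
    """Return (PAGE_FREE pointer, PAGE_GARBAGE byte count) from a page buffer."""
    (free_ptr,) = struct.unpack_from(">H", page, PAGE_HEADER + PAGE_FREE)
    (garbage,) = struct.unpack_from(">H", page, PAGE_HEADER + PAGE_GARBAGE)
    return free_ptr, garbage

# Synthetic 16 KB page with the two fields filled in by hand.
fake_page = bytearray(16384)
struct.pack_into(">HH", fake_page, PAGE_HEADER + PAGE_FREE, 330, 34)

print(read_free_info(fake_page))  # (330, 34)
```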
[root@hebe211 ibd]# python innodb_extract.py ibd_test.ibd
PAGE_FREE pointer offset 330,PAGE_GARBAGE size 34
now row begin offset 99
infimum
now row begin offset 126
row_id:000000000213,info_bits:0000,n_owned:0000,order:2(0000000000010),next offset:34(0000000000100010)
1 test1
now row begin offset 160
row_id:000000000214,info_bits:0000,n_owned:0000,order:3(0000000000011),next offset:68(0000000001000100)
2 test2
now row begin offset 228
row_id:000000000216,info_bits:0000,n_owned:0000,order:5(0000000000101),next offset:34(0000000000100010)
4 test4
now row begin offset 262
row_id:000000000217,info_bits:0000,n_owned:0100,order:6(0000000000110),next offset:-68(1111111110111100)
5 test5
now row begin offset 194
row_id:000000000218,info_bits:0000,n_owned:0000,order:4(0000000000100),next offset:102(0000000001100110)
6 test6
now row begin offset 296
row_id:000000000219,info_bits:0000,n_owned:0000,order:7(0000000000111),next offset:68(0000000001000100)
7 test7
now row begin offset 364
row_id:00000000021b,info_bits:0000,n_owned:0000,order:9(0000000001001),next offset:-252(1111111100000100)
9 test9
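Now the row with row_id 000000000219 points to row_id 00000000021b as its next row: the relative offset changed from 34 to 68, skipping the just-deleted row with row_id 00000000021a. PAGE_FREE now points to offset 330 and PAGE_GARBAGE is 34 bytes. The value 330 equals the starting offset of row_id 000000000219 (296) plus 34 (the size of the row just deleted), which is where the deleted row begins. In other words, the row that was unlinked from the data list has been placed into the reusable-space list. This pointer always points to the most recently deleted row. If new data is inserted and this space is reused, the row is removed from the reusable-space list and put back into the data list.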
step 5: to verify the idea above, we go on to delete the row with id 1 (row_id 000000000213):
localhost.test>select * from ibd2_test;
+----+-------+
| id | name  |
+----+-------+
|  2 | test2 |
|  4 | test4 |
|  5 | test5 |
|  6 | test6 |
|  7 | test7 |
|  9 | test9 |
+----+-------+
6 rows in set (0.00 sec)
Now let's look at how the reusable-space pointer has changed:
[root@hebe211 ibd]# python innodb_extract.py ibd2_test.ibd
PAGE_FREE pointer offset 126,PAGE_GARBAGE size 68
now row begin offset 99
infimum
now row begin offset 160
row_id:000000000214,info_bits:0000,n_owned:0000,order:3(0000000000011),next offset:68(0000000001000100)
2 test2
now row begin offset 228
row_id:000000000216,info_bits:0000,n_owned:0000,order:5(0000000000101),next offset:34(0000000000100010)
4 test4
now row begin offset 262
row_id:000000000217,info_bits:0000,n_owned:0000,order:6(0000000000110),next offset:-68(1111111110111100)
5 test5
now row begin offset 194
row_id:000000000218,info_bits:0000,n_owned:0000,order:4(0000000000100),next offset:102(0000000001100110)
6 test6
now row begin offset 296
row_id:000000000219,info_bits:0000,n_owned:0000,order:7(0000000000111),next offset:68(0000000001000100)
7 test7
now row begin offset 364
row_id:00000000021b,info_bits:0000,n_owned:0000,order:9(0000000001001),next offset:-252(1111111100000100)
9 test9
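After deleting the row with id 1, the PAGE_FREE pointer points to position 126 and the reusable space has grown to 68 bytes. The next-record pointer of the infimum pseudo-record now leads to the row with row_id 000000000214 instead of row_id 000000000213, so the first row in the list starts at offset 160 rather than 126 and the deleted row is skipped. The offset PAGE_FREE points to, 126, is exactly the starting position of the deleted row (row_id 000000000213, offset 126), and the reusable space went from 34 bytes to 68 bytes. This shows that PAGE_FREE points to the most recently deleted row, and that when new data is inserted, the space of the most recently deleted row is reused first. This follows a "last in, first out" rule, similar to a stack.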
step 6: finally, we insert one more row and check whether it reuses the space of the row with row_id 000000000213. If it does, the idea above is confirmed.
localhost.test>select * from ibd2_test;
+----+-------+
| id | name  |
+----+-------+
|  2 | test2 |
|  4 | test4 |
|  5 | test5 |
|  6 | test6 |
|  7 | test7 |
|  9 | test9 |
|  3 | testa |
+----+-------+
7 rows in set (0.00 sec)
[root@hebe211 ibd]# python innodb_extract.py ibd2_test.ibd
PAGE_FREE pointer offset 330,PAGE_GARBAGE size 34
now row begin offset 99
infimum
now row begin offset 160
row_id:000000000214,info_bits:0000,n_owned:0000,order:3(0000000000011),next offset:68(0000000001000100)
2 test2
now row begin offset 228
row_id:000000000216,info_bits:0000,n_owned:0000,order:5(0000000000101),next offset:34(0000000000100010)
4 test4
now row begin offset 262
row_id:000000000217,info_bits:0000,n_owned:0000,order:6(0000000000110),next offset:-68(1111111110111100)
5 test5
now row begin offset 194
row_id:000000000218,info_bits:0000,n_owned:0000,order:4(0000000000100),next offset:102(0000000001100110)
6 test6
now row begin offset 296
row_id:000000000219,info_bits:0000,n_owned:0000,order:7(0000000000111),next offset:68(0000000001000100)
7 test7
now row begin offset 364
row_id:00000000021b,info_bits:0000,n_owned:0000,order:9(0000000001001),next offset:-238(1111111100010010)
9 test9
now row begin offset 126
row_id:00000000021c,info_bits:0000,n_owned:0000,order:2(0000000000010),next offset:-14(1111111111110010)
3 testa
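After inserting the row with id=3 (row_id 00000000021c), the offset PAGE_FREE points to went from 126 back to 330 and the reusable space went back to 34 bytes: the space of the most recently deleted row was removed from the delete list. We can also see that the newly inserted row starts at offset 126 with order 2, which is the space previously occupied by the deleted id=1 row (row_id 000000000213). That space has been reused by the new row.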
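The PAGE_FREE and PAGE_GARBAGE values seen in steps 4 to 6 can be replayed with a toy stack. This is only a model of the behaviour observed above, not InnoDB's allocation code: the `FreeList` class is hypothetical, it stores absolute offsets for simplicity, and it assumes every freed slot is 34 bytes and that an insert always fits into the head slot.

```python
class FreeList:
    """Toy last-in, first-out model of a page's list of deleted records."""

    def __init__(self):
        self.page_free = 0      # 0 stands for "list is empty"
        self.page_garbage = 0   # total bytes held by deleted records
        self._slots = {}        # offset -> (offset of next free slot, size)

    def push(self, offset, size):
        """A delete puts the freed record at the head of the list."""
        self._slots[offset] = (self.page_free, size)
        self.page_free = offset
        self.page_garbage += size

    def pop(self):
        """An insert takes the head slot, i.e. the most recently freed one."""
        if self.page_free == 0:
            raise IndexError("free list is empty")
        offset = self.page_free
        self.page_free, size = self._slots.pop(offset)
        self.page_garbage -= size
        return offset

free = FreeList()
free.push(330, 34)                          # step 4: delete id 8
print(free.page_free, free.page_garbage)    # 330 34
free.push(126, 34)                          # step 5: delete id 1
print(free.page_free, free.page_garbage)    # 126 68
reused = free.pop()                         # step 6: insert id 3
print(reused, free.page_free, free.page_garbage)  # 126 330 34
```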
The changes to the delete list from step 5 to step 6 are summarized in the figure:
Finally, we turn on debug output and analyze what the delete list currently contains:
[root@hebe211 ibd]# python innodb_extract.py ibd2_test.ibd
PAGE_FREE pointer offset 330,PAGE_GARBAGE size 34
row_id:00000000021a,info_bits:0010,n_owned:0000,order:8(0000000001000),next offset:0(0000000000000000)
now row begin offset 99
infimum
now row begin offset 160
row_id:000000000214,info_bits:0000,n_owned:0000,order:3(0000000000011),next offset:68(0000000001000100)
2 test2
now row begin offset 228
row_id:000000000216,info_bits:0000,n_owned:0000,order:5(0000000000101),next offset:34(0000000000100010)
4 test4
now row begin offset 262
row_id:000000000217,info_bits:0000,n_owned:0000,order:6(0000000000110),next offset:-68(1111111110111100)
5 test5
now row begin offset 194
row_id:000000000218,info_bits:0000,n_owned:0000,order:4(0000000000100),next offset:102(0000000001100110)
6 test6
now row begin offset 296
row_id:000000000219,info_bits:0000,n_owned:0000,order:7(0000000000111),next offset:68(0000000001000100)
7 test7
now row begin offset 364
row_id:00000000021b,info_bits:0000,n_owned:0000,order:9(0000000001001),next offset:-238(1111111100010010)
9 test9
now row begin offset 126
row_id:00000000021c,info_bits:0000,n_owned:0000,order:2(0000000000010),next offset:-14(1111111111110010)
3 testa
row_id:00000000021a,info_bits:0010,n_owned:0000,order:8(0000000001000),next offset:0(0000000000000000)
now row begin offset 99
row_id 00000000021a is the record with id=8 that was deleted earlier.
The key detail is info_bits:0010. The third bit is the deleted flag, and a value of 1 means the record has been deleted.
Because the delete list contains only this one record, its next offset, which would point to the following record, is 0.
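Putting the record header together with the physical storage format, we can see that there are three lists: the logical record list, the physical record list, and the deleted record list.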