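Before MySQL 8.0, for architectural reasons, the server layer stored table metadata in unified frm files that every storage engine could read, while InnoDB also kept its own copy of the metadata. This made DDL hard to get right, because the architecture could not make DDL atomic. In production we often saw temporary files left behind in the data directory, or errors such as a column-count mismatch between the server layer and the InnoDB layer. Some DDL could even leave metadata behind inside InnoDB while the frm file was lost, so the table could not be recreated. (To work around this we implemented a feature called drop table force, which forces a cleanup.)
(All of the discussion below assumes the InnoDB storage engine.)
In 8.0, all metadata is managed uniformly by InnoDB, which makes atomic DDL possible. Almost every operation on InnoDB tables, stored procedures, triggers, views, or UDFs can now be atomic: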
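- The metadata change, the binlog write, and the InnoDB operations are all placed in a single transaction.
- A hidden internal system table, `mysql.innodb_ddl_log`, is added, and DDL operations are recorded in it. Note that the redo generated by operations on this table is fsynced to disk regardless of the innodb_flush_log_at_trx_commit setting. On restart after a crash, whether the transaction committed decides whether the records in this table are used to roll the DDL back or to carry it out.
- A post-ddl phase is added as the last phase of a DDL. It:
  1. physically deletes or renames files;
  2. deletes the entries in innodb_ddl_log;
  3. for some DDL, updates the dynamic metadata (stored in `mysql.innodb_dynamic_metadata`, for example the corrupt flag and the auto_inc value).
- When a DDL finishes normally, its ddl log has also been cleaned up. If the server crashes in between, the log is replayed on restart:
  1. if the DDL had already reached its last phase (after commit), the ddl log is replayed to complete the DDL;
  2. if it was in some intermediate state, the DDL is rolled back.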
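With atomic DDL in place, the behavior of some DDL operations has also changed:
- DROP TABLE: in earlier versions, if a single DROP TABLE statement names several tables, say t1 and t2, and t2 does not exist, t1 is still dropped. In 8.0 neither t1 nor t2 is dropped and an error is raised instead. Watch out for this when replicating from 5.7 to 8.0 (DROP VIEW and CREATE USER have similar issues).
- DROP DATABASE: the transaction that modifies the metadata and writes the ddl_log is committed first, and the physical deletion of the data files comes last. If the server crashes while deleting files, the restart continues the drop database according to the ddl_log.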
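The DROP TABLE difference can be sketched with a set standing in for the data dictionary. This is only an illustration of the semantics described above; the function names and the `LookupError` are made up for the example and are not MySQL APIs.

```python
def drop_tables_pre80(catalog, names):
    """Pre-8.0 behaviour as described above: existing tables are dropped
    even when another table in the same statement is missing."""
    missing = [n for n in names if n not in catalog]
    for n in names:
        catalog.discard(n)          # drop whatever exists
    if missing:
        raise LookupError("Unknown table: " + ", ".join(missing))


def drop_tables_80(catalog, names):
    """8.0 behaviour: the statement is all-or-nothing."""
    missing = [n for n in names if n not in catalog]
    if missing:
        raise LookupError("Unknown table: " + ", ".join(missing))
    for n in names:
        catalog.remove(n)


old_server, new_server = {"t1"}, {"t1"}
for drop, catalog in ((drop_tables_pre80, old_server), (drop_tables_80, new_server)):
    try:
        drop(catalog, ["t1", "t2"])   # t2 does not exist
    except LookupError:
        pass                          # both versions report an error
```

After the loop, `old_server` is empty while `new_server` still contains `t1`, which is why the same statement can leave a 5.7 source and an 8.0 replica with different tables.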
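MySQL helpfully provides an option, `innodb_print_ddl_logs`. Once it is enabled, the corresponding ddl log records show up in the error log. Below we use it to walk through some typical DDL statements.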
    root@(none) 11:12:19>SET GLOBAL innodb_print_ddl_logs = 1;
    Query OK, 0 rows affected (0.00 sec)
    root@(none) 11:12:22>SET GLOBAL log_error_verbosity = 3;
    Query OK, 0 rows affected (0.00 sec)
    mysql> CREATE DATABASE test;
    Query OK, 1 row affected (0.02 sec)
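CREATE DATABASE does not write any ddl log, perhaps because it is not considered a high-frequency operation. If creating the database fails midway, the directory may have to be removed by hand after the restart.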
    mysql> USE test;
    Database changed
    mysql> CREATE TABLE t1 (a INT PRIMARY KEY, b INT);
    Query OK, 0 rows affected (0.06 sec)
    [InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=428, thread_id=7, space_id=76, old_file_path=./test/t1.ibd]
    [InnoDB] DDL log delete : by id 428
    [InnoDB] DDL log insert : [DDL record: REMOVE CACHE, id=429, thread_id=7, table_id=1102, new_file_path=test/t1]
    [InnoDB] DDL log delete : by id 429
    [InnoDB] DDL log insert : [DDL record: FREE, id=430, thread_id=7, space_id=76, index_id=190, page_no=4]
    [InnoDB] DDL log delete : by id 430
    [InnoDB] DDL log post ddl : begin for thread id : 7
    [InnoDB] DDL log post ddl : end for thread id : 7
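The log shows three kinds of records. They describe the three inverse operations that are needed if the statement fails: delete the data file, release the in-memory data dictionary information, and drop the index btree. These records are written to ddl_log before the table is created, and are removed from the ddl log once the table has been created and the transaction committed.
The log above also contains DDL log delete entries. Every write to the ddl log is committed as its own transaction, but right after that commit the current transaction issues a delete for the same record, and that delete is not committed until the whole operation finishes.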
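The insert-then-delete pattern can be modelled in a few lines. The sketch below is a toy model, not InnoDB's data structures: `DDLLogModel`, `write_inverse`, and `write_deferred` are hypothetical names. It only shows why the rows left in the table after a rollback (or a crash before commit) are exactly the inverse operations, while the rows left after a commit are exactly the deferred ones.

```python
class DDLLogModel:
    """Toy model of the ddl log table plus one parent (DDL) transaction."""

    def __init__(self):
        self.rows = {}                    # durable rows: id -> record
        self.uncommitted_inserts = {}     # rows inserted by the parent trx
        self.uncommitted_deletes = set()  # ids delete-marked by the parent trx
        self.next_id = 1

    def _new_id(self):
        rid = self.next_id
        self.next_id += 1
        return rid

    def write_inverse(self, record):
        """Insert commits on its own; the parent trx then delete-marks the row."""
        rid = self._new_id()
        self.rows[rid] = record
        self.uncommitted_deletes.add(rid)
        return rid

    def write_deferred(self, record):
        """Insert is part of the parent trx (work postponed to post-ddl)."""
        rid = self._new_id()
        self.uncommitted_inserts[rid] = record
        return rid

    def commit(self):
        self.rows.update(self.uncommitted_inserts)
        for rid in self.uncommitted_deletes:
            del self.rows[rid]
        self._end_parent_trx()

    def rollback(self):
        """Also models a crash before the parent trx commits."""
        self._end_parent_trx()

    def _end_parent_trx(self):
        self.uncommitted_inserts = {}
        self.uncommitted_deletes = set()

    def to_replay(self):
        """Rows still present, newest first."""
        return [self.rows[rid] for rid in sorted(self.rows, reverse=True)]


def create_table(outcome):
    log = DDLLogModel()
    for record in ("DELETE SPACE", "REMOVE CACHE", "FREE"):
        log.write_inverse(record)
    getattr(log, outcome)()
    return log.to_replay()


def drop_table(outcome):
    log = DDLLogModel()
    for record in ("DROP", "DELETE SPACE"):
        log.write_deferred(record)
    getattr(log, outcome)()
    return log.to_replay()
```

With this model, `create_table("commit")` leaves nothing to replay, while `create_table("rollback")` leaves all three cleanup records. A statement whose records are all deferred, such as `drop_table`, behaves the other way round.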
    mysql> ALTER TABLE t1 ADD COLUMN c INT;
    Query OK, 0 rows affected (0.08 sec)
    Records: 0 Duplicates: 0 Warnings: 0
    [InnoDB] DDL log post ddl : begin for thread id : 7
    [InnoDB] DDL log post ddl : end for thread id : 7
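Note that this statement runs as Instant DDL, a feature introduced in 8.0.12. Adding a column only needs to change metadata, so nothing has to be recorded in the ddl log.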
    mysql> ALTER TABLE t1 DROP COLUMN c;
    Query OK, 0 rows affected (2.77 sec)
    Records: 0 Duplicates: 0 Warnings: 0
    [InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=487, thread_id=7, space_id=83, old_file_path=./test/#sql-ib1108-1917598001.ibd]
    [InnoDB] DDL log delete : by id 487
    [InnoDB] DDL log insert : [DDL record: REMOVE CACHE, id=488, thread_id=7, table_id=1109, new_file_path=test/#sql-ib1108-1917598001]
    [InnoDB] DDL log delete : by id 488
    [InnoDB] DDL log insert : [DDL record: FREE, id=489, thread_id=7, space_id=83, index_id=200, page_no=4]
    [InnoDB] DDL log delete : by id 489
    [InnoDB] DDL log insert : [DDL record: DROP, id=490, thread_id=7, table_id=1108]
    [InnoDB] DDL log insert : [DDL record: RENAME SPACE, id=491, thread_id=7, space_id=82, old_file_path=./test/#sql-ib1109-1917598002.ibd, new_file_path=./test/t1.ibd]
    [InnoDB] DDL log delete : by id 491
    [InnoDB] DDL log insert : [DDL record: RENAME TABLE, id=492, thread_id=7, table_id=1108, old_file_path=test/#sql-ib1109-1917598002, new_file_path=test/t1]
    [InnoDB] DDL log delete : by id 492
    [InnoDB] DDL log insert : [DDL record: RENAME SPACE, id=493, thread_id=7, space_id=83, old_file_path=./test/t1.ibd, new_file_path=./test/#sql-ib1108-1917598001.ibd]
    [InnoDB] DDL log delete : by id 493
    [InnoDB] DDL log insert : [DDL record: RENAME TABLE, id=494, thread_id=7, table_id=1109, old_file_path=test/t1, new_file_path=test/#sql-ib1108-1917598001]
    [InnoDB] DDL log delete : by id 494
    [InnoDB] DDL log insert : [DDL record: DROP, id=495, thread_id=7, table_id=1108]
    [InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=496, thread_id=7, space_id=82, old_file_path=./test/#sql-ib1109-1917598002.ibd]
    [InnoDB] DDL log post ddl : begin for thread id : 7
    [InnoDB] DDL log replay : [DDL record: DELETE SPACE, id=496, thread_id=7, space_id=82, old_file_path=./test/#sql-ib1109-1917598002.ibd]
    [InnoDB] DDL log replay : [DDL record: DROP, id=495, thread_id=7, table_id=1108]
    [InnoDB] DDL log replay : [DDL record: DROP, id=490, thread_id=7, table_id=1108]
    [InnoDB] DDL log post ddl : end for thread id : 7
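This is a typical three-phase DDL, made up of the prepare, perform, and commit phases:
The ddl log written in the commit phase describes the inverse of what that phase does, which is: rename t1.ibd to #sql-ib1109-1917598002, rename #sql-ib1108-1917598001 to t1, and finally drop the old table. The drop of the old table is not executed at this point; it is deferred to the post-ddl phase.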
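One way to read such a log is to list the records that were inserted but never matched by a "DDL log delete" line, since those are the ones replayed in post-ddl. The helper below is a small sketch for that; `pending_after_commit` is a made-up name, and the regular expressions only assume the line format shown in the log above. The sample is an excerpt of that log.

```python
import re

# Patterns for the two kinds of lines printed by innodb_print_ddl_logs.
INSERT_RE = re.compile(r"DDL log insert : \[DDL record: ([A-Z ]+), id=(\d+)")
DELETE_RE = re.compile(r"DDL log delete : by id (\d+)")


def pending_after_commit(log_text):
    """Return (id, type) for records inserted but never deleted, newest first."""
    inserted = {int(rid): rtype for rtype, rid in INSERT_RE.findall(log_text)}
    deleted = {int(rid) for rid in DELETE_RE.findall(log_text)}
    return [(rid, inserted[rid])
            for rid in sorted(inserted, reverse=True)
            if rid not in deleted]


sample = """
[InnoDB] DDL log insert : [DDL record: FREE, id=489, thread_id=7, space_id=83, index_id=200, page_no=4]
[InnoDB] DDL log delete : by id 489
[InnoDB] DDL log insert : [DDL record: DROP, id=490, thread_id=7, table_id=1108]
[InnoDB] DDL log insert : [DDL record: RENAME SPACE, id=491, thread_id=7, space_id=82, old_file_path=./test/#sql-ib1109-1917598002.ibd, new_file_path=./test/t1.ibd]
[InnoDB] DDL log delete : by id 491
[InnoDB] DDL log insert : [DDL record: DROP, id=495, thread_id=7, table_id=1108]
[InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=496, thread_id=7, space_id=82, old_file_path=./test/#sql-ib1109-1917598002.ibd]
"""

pending = pending_after_commit(sample)
```

For this excerpt `pending` lists ids 496, 495, and 490, in the same order as the "DDL log replay" lines of the full log.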
    mysql> ALTER TABLE t1 ADD KEY(b);
    Query OK, 0 rows affected (0.14 sec)
    Records: 0 Duplicates: 0 Warnings: 0
    [InnoDB] DDL log insert : [DDL record: FREE, id=431, thread_id=7, space_id=76, index_id=191, page_no=5]
    [InnoDB] DDL log delete : by id 431
    [InnoDB] DDL log post ddl : begin for thread id : 7
    [InnoDB] DDL log post ddl : end for thread id : 7
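The index is built in place, so there is no temporary file. If an error occurs, however, the temporary index still has to be cleaned up, so a FREE record is written to allow the temporary index to be removed on failure.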
    mysql> TRUNCATE TABLE t1;
    Query OK, 0 rows affected (0.13 sec)
    [InnoDB] DDL log insert : [DDL record: RENAME SPACE, id=439, thread_id=7, space_id=77, old_file_path=./test/#sql-ib1103-1917597994.ibd, new_file_path=./test/t1.ibd]
    [InnoDB] DDL log delete : by id 439
    [InnoDB] DDL log insert : [DDL record: DROP, id=440, thread_id=7, table_id=1103]
    [InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=441, thread_id=7, space_id=77, old_file_path=./test/#sql-ib1103-1917597994.ibd]
    [InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=442, thread_id=7, space_id=78, old_file_path=./test/t1.ibd]
    [InnoDB] DDL log delete : by id 442
    [InnoDB] DDL log insert : [DDL record: REMOVE CACHE, id=443, thread_id=7, table_id=1104, new_file_path=test/t1]
    [InnoDB] DDL log delete : by id 443
    [InnoDB] DDL log insert : [DDL record: FREE, id=444, thread_id=7, space_id=78, index_id=194, page_no=4]
    [InnoDB] DDL log delete : by id 444
    [InnoDB] DDL log insert : [DDL record: FREE, id=445, thread_id=7, space_id=78, index_id=195, page_no=5]
    [InnoDB] DDL log delete : by id 445
    [InnoDB] DDL log post ddl : begin for thread id : 7
    [InnoDB] DDL log replay : [DDL record: DELETE SPACE, id=441, thread_id=7, space_id=77, old_file_path=./test/#sql-ib1103-1917597994.ibd]
    [InnoDB] DDL log replay : [DDL record: DROP, id=440, thread_id=7, table_id=1103]
    [InnoDB] DDL log post ddl : end for thread id : 7
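Truncate table is an interesting case. In 5.6 and earlier it was implemented by dropping the old table and creating a new one. From 5.7 on, to guarantee atomicity, it was changed to truncate the file in place, together with a truncate log file: if the server crashed during the truncate, crash recovery used that file to truncate again. In 8.0 it went back to dropping the old table and creating a new one. The difference from the earlier versions is that after a crash 8.0 can roll back to the old data instead of re-executing the truncate. Taking the log above as an example, the main steps are: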
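Reading the TRUNCATE log above, the sequence appears to be: move the old file aside under a temporary name, create a fresh empty t1.ibd, and delete the old file only in post-ddl after the commit. The sketch below plays this out on a dict that stands in for the data directory. It is an interpretation of the log rather than InnoDB code, and the temporary name is hypothetical.

```python
def truncate(files, name, crash_before_commit=False):
    """files maps a file name to its rows; a stand-in for the data directory."""
    tmp = "#sql-tmp-" + name           # hypothetical temporary name

    # Move the old file aside; the inverse (tmp -> name) is what gets logged.
    files[tmp] = files.pop(name)
    # Create the new, empty table; the inverse (delete it) is logged as well.
    files[name] = []

    if crash_before_commit:
        # Rollback: undo in reverse order, so the old data comes back.
        del files[name]
        files[name] = files.pop(tmp)
    else:
        # Commit, then post-ddl: only now is the old file really deleted.
        del files[tmp]
    return files


committed = truncate({"t1.ibd": [1, 2, 3]}, "t1.ibd")
rolled_back = truncate({"t1.ibd": [1, 2, 3]}, "t1.ibd", crash_before_commit=True)
```

`committed` ends up with an empty t1.ibd, while `rolled_back` still holds the original rows, which is the "roll back to the old data" behaviour described above.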
    mysql> RENAME TABLE t1 TO t2;
    Query OK, 0 rows affected (0.06 sec)
DDL LOG:
    [InnoDB] DDL log insert : [DDL record: RENAME SPACE, id=450, thread_id=7, space_id=78, old_file_path=./test/t2.ibd, new_file_path=./test/t1.ibd]
    [InnoDB] DDL log delete : by id 450
    [InnoDB] DDL log insert : [DDL record: RENAME TABLE, id=451, thread_id=7, table_id=1104, old_file_path=test/t2, new_file_path=test/t1]
    [InnoDB] DDL log delete : by id 451
    [InnoDB] DDL log post ddl : begin for thread id : 7
    [InnoDB] DDL log post ddl : end for thread id : 7
This one is simpler: only the inverse operations of rename space and rename table need to be recorded, and post-ddl has no real work to do.
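As a small illustration, the two inverse records can be derived mechanically from the statement. The function below is a hypothetical helper, not part of MySQL. It only reproduces the path fields seen in the log above and leaves out ids such as space_id and table_id.

```python
def inverse_rename_records(db, old, new):
    """Records that would undo RENAME TABLE old TO new in schema db."""
    return [
        {   # file level: move the .ibd file back
            "type": "RENAME SPACE",
            "old_file_path": f"./{db}/{new}.ibd",
            "new_file_path": f"./{db}/{old}.ibd",
        },
        {   # dictionary level: give the table object its old name back
            "type": "RENAME TABLE",
            "old_file_path": f"{db}/{new}",
            "new_file_path": f"{db}/{old}",
        },
    ]


records = inverse_rename_records("test", "t1", "t2")
```

For `RENAME TABLE t1 TO t2` this yields t2 to t1 in both records, matching the old_file_path and new_file_path values in the log.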
    DROP TABLE t2
    [InnoDB] DDL log insert : [DDL record: DROP, id=595, thread_id=7, table_id=1119]
    [InnoDB] DDL log insert : [DDL record: DELETE SPACE, id=596, thread_id=7, space_id=93, old_file_path=./test/t2.ibd]
    [InnoDB] DDL log post ddl : begin for thread id : 7
    [InnoDB] DDL log replay : [DDL record: DELETE SPACE, id=596, thread_id=7, space_id=93, old_file_path=./test/t2.ibd]
    [InnoDB] DDL log replay : [DDL record: DROP, id=595, thread_id=7, table_id=1119]
    [InnoDB] DDL log post ddl : end for thread id : 7
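What has to be deleted is first recorded in the ddl log. After the commit, the final post-ddl phase performs the actual removal of the table object and the file.
The implementation lives mainly in storage/innobase/log/log0ddl.cc, which contains the logic for inserting records into the ddl log table and for replaying them.
The hidden innodb_ddl_log table is defined as follows: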
    def->add_field(0, "id", "id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT");
    def->add_field(1, "thread_id", "thread_id BIGINT UNSIGNED NOT NULL");
    def->add_field(2, "type", "type INT UNSIGNED NOT NULL");
    def->add_field(3, "space_id", "space_id INT UNSIGNED");
    def->add_field(4, "page_no", "page_no INT UNSIGNED");
    def->add_field(5, "index_id", "index_id BIGINT UNSIGNED");
    def->add_field(6, "table_id", "table_id BIGINT UNSIGNED");
    def->add_field(7, "old_file_path", "old_file_path VARCHAR(512) COLLATE UTF8_BIN");
    def->add_field(8, "new_file_path", "new_file_path VARCHAR(512) COLLATE UTF8_BIN");
    def->add_index(0, "index_pk", "PRIMARY KEY(id)");
    def->add_index(1, "index_k_thread_id", "KEY(thread_id)");
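By operation type, the records fall into the following categories:
Entry function: Log_DDL::write_free_tree_log, called when creating an index and when dropping a table. For the index removal involved in drop table, the ddl log insert is done in the parent transaction, so the two commit or roll back together.
For the create-index case, the ddl log record has to be committed separately and the parent transaction delete-marks it. That way, if the DDL is later rolled back, the leftover index can still be removed.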
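Entry function: Log_DDL::write_delete_space_log
Records the deletion of a tablespace. Again there are two cases:
Entry function: Log_DDL::write_rename_space_log
Records a rename. For example, if we rename table t1 to t2, the record holds the inverse operation, t2 rename to t1.
In Fil_shard::space_rename(), the ddl log is always written first and the actual rename is done afterwards. Writing the log is again committed as an independent transaction, and the parent transaction then performs an uncommitted delete of the record.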
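Entry function: Log_DDL::write_drop_log
Records the removal of a table object. No file-level operation is involved here, and the ddl log write is done in the parent transaction.
Entry function: Log_DDL::write_rename_table_log
Records the inverse of renaming a table object. As with rename space, the ddl log record is committed in an independent transaction and delete-marked by the parent transaction.
Entry function: Log_DDL::write_remove_cache_log
Handles cleanup of the in-memory table object. The record is committed in an independent transaction and delete-marked by the parent transaction.
Entry function: Log_DDL::write_alter_encrypt_space_log
Records a change to the encryption attribute of a tablespace, committed as an independent transaction. The encryption flag in page 0 of the tablespace is modified after the ddl log has been written.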
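To sum up, a DDL may commit transactions several times, roughly in three categories:
As described above, some ddl log records are committed together with the parent transaction, while others are only executed in the post-ddl phase. post_ddl runs after the parent transaction has committed or rolled back. If the transaction rolled back, the inverse operations are performed according to the ddl log. If it committed, post-ddl performs the final, truly irreversible operations (for example deleting files).
Entry function: Log_DDL::post_ddl --> Log_DDL::replay_by_thread_id
Using the thread id of the thread that ran the DDL, the secondary index on the innodb_ddl_log table is searched to find the log ids, and the clustered index is then used to fetch the matching records. These operations are replayed, and once the DDL is complete the corresponding records are cleaned up.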
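The lookup by thread id can be tried out with an in-memory SQLite table that mirrors a few of the columns. This is a rough analogy only: SQLite is not InnoDB, `replay_for_thread` is a hypothetical name, the `type` column is stored as text for readability (the real column is INT UNSIGNED), the row with id 597 is made up, and the descending order is taken from the replay order seen in the logs earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ddl_log (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    thread_id     INTEGER NOT NULL,
    type          TEXT NOT NULL,      -- INT UNSIGNED in the real table
    old_file_path TEXT
);
CREATE INDEX index_k_thread_id ON ddl_log (thread_id);
""")
conn.executemany(
    "INSERT INTO ddl_log (id, thread_id, type, old_file_path) VALUES (?, ?, ?, ?)",
    [
        (595, 7, "DROP", None),
        (596, 7, "DELETE SPACE", "./test/t2.ibd"),
        (597, 8, "DROP", None),       # made-up row owned by another thread
    ],
)


def replay_for_thread(conn, thread_id):
    """Fetch this thread's records newest first, then clean them up."""
    rows = conn.execute(
        "SELECT id, type FROM ddl_log WHERE thread_id = ? ORDER BY id DESC",
        (thread_id,),
    ).fetchall()
    # A real replay would act on each record here; this sketch only returns them.
    conn.execute("DELETE FROM ddl_log WHERE thread_id = ?", (thread_id,))
    return rows


replayed = replay_for_thread(conn, 7)
remaining = conn.execute("SELECT id FROM ddl_log").fetchall()
```

Only the rows of thread 7 are returned and removed. The record that belongs to thread 8 stays in the table for its own post-ddl.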
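After crash recovery finishes, the ha_post_recover interface is called, which in turn calls the InnoDB function Log_DDL::recover(). It replays the remaining records in the same way and deletes them when it is done. Records of type ALTER_ENCRYPT_TABLESPACE_LOG are an exception: they are not deleted at this step but are added to an array, ts_encrypt_ddl_records, and resume_alter_encrypt_tablespace is called later to resume the operation.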
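The special handling of encryption records during recovery can be sketched as follows. This is a simplified model under stated assumptions: the table is a plain dict, the type strings other than ALTER_ENCRYPT_TABLESPACE_LOG follow the names printed in the error log, and the newest-first order is carried over from the post-ddl replays shown earlier rather than confirmed for recovery.

```python
ENCRYPT = "ALTER_ENCRYPT_TABLESPACE_LOG"


def recover(table):
    """table maps record id -> record type, as left over after a crash.

    Returns (replayed_ids, ts_encrypt_ddl_records). Replayed records are
    removed from the table; encryption records stay for a later resume.
    """
    replayed, ts_encrypt_ddl_records = [], []
    for rid in sorted(table, reverse=True):
        if table[rid] == ENCRYPT:
            ts_encrypt_ddl_records.append(rid)   # handled later, not deleted
        else:
            replayed.append(rid)                 # a real replay would act here
    for rid in replayed:
        del table[rid]
    return replayed, ts_encrypt_ddl_records


leftover = {10: "DELETE SPACE", 11: ENCRYPT, 12: "DROP"}
replayed_ids, deferred = recover(leftover)
```

After the call only record 11 is still in `leftover`, waiting for the resume step.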
This article is original content from the Yunqi Community (雲棲社區) and may not be reproduced without permission.