[Case Study] Using innodb_force_recovery to Fix a MySQL Server That Will Not Restart After a Crash

Date: 2019-12-07


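I. Background
A friend at a startup had a host crash because of a damaged disk array. Restarting the MySQL service afterwards failed with the following error: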

InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 9120034833
150125 16:12:51 InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 150125 16:12:51 [ERROR] mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
To report this bug, see http://kb.askmonty.org/en/reporting-bugs
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.
Server version: 5.5.37-MariaDB-log
key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=0
max_threads=1002
thread_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2332093 K bytes of memory
41 Hope that.

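II. Analysis
The key symptom is "mysqld got signal 11". The log indicates that the machine crash corrupted the log files, so after the restart the database could not complete recovery, let alone serve requests.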

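III. Solution
Because the logs are already corrupted, an unconventional approach is used here: first set the innodb_force_recovery parameter so that mysqld skips the recovery steps and can start, then export the data and rebuild the database.
innodb_force_recovery can be set to a value from 1 to 6. A larger value includes the effects of all smaller values.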
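1. (SRV_FORCE_IGNORE_CORRUPT): ignore corrupt pages when they are detected.
2. (SRV_FORCE_NO_BACKGROUND): prevent the master thread from running. This avoids the crash that would otherwise occur if the master thread had to perform a full purge.
3. (SRV_FORCE_NO_TRX_UNDO): do not run transaction rollbacks.
4. (SRV_FORCE_NO_IBUF_MERGE): do not run insert buffer merge operations.
5. (SRV_FORCE_NO_UNDO_LOG_SCAN): do not look at the undo logs; InnoDB treats uncommitted transactions as committed.
6. (SRV_FORCE_NO_LOG_REDO): do not perform the redo log roll-forward.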
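The cumulative rule ("a larger value includes the effects of all smaller values") can be sketched in a few lines of Python. This is only an illustration: the constant names come from the list above, while the class name `ForceRecovery` and the helper `effective_levels` are hypothetical and are not part of MySQL.

```python
from enum import IntEnum


class ForceRecovery(IntEnum):
    """innodb_force_recovery levels, using the names from the list above."""
    SRV_FORCE_IGNORE_CORRUPT = 1
    SRV_FORCE_NO_BACKGROUND = 2
    SRV_FORCE_NO_TRX_UNDO = 3
    SRV_FORCE_NO_IBUF_MERGE = 4
    SRV_FORCE_NO_UNDO_LOG_SCAN = 5
    SRV_FORCE_NO_LOG_REDO = 6


def effective_levels(value):
    """Return every level whose effect applies at the given setting.

    Hypothetical helper: it only models the cumulative rule, it does not
    talk to a server.
    """
    if not 0 <= value <= 6:
        raise ValueError("innodb_force_recovery must be between 0 and 6")
    # A setting of N switches on the behaviour of levels 1..N.
    return [level for level in ForceRecovery if level <= value]


print([level.name for level in effective_levels(3)])
# ['SRV_FORCE_IGNORE_CORRUPT', 'SRV_FORCE_NO_BACKGROUND', 'SRV_FORCE_NO_TRX_UNDO']
```

Read this way, setting the value to 6 as done in this case switches on all six behaviours at once.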
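Notes
a. Once the parameter is set to a value greater than 0, tables can still be queried with SELECT and created or dropped with CREATE and DROP, but INSERT, UPDATE, and DELETE operations are not permitted.
b. Setting innodb_purge_threads together with innodb_force_recovery can produce a loop at startup: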

150125 17:07:42 InnoDB: Waiting for the background threads to start
150125 17:07:43 InnoDB: Waiting for the background threads to start
150125 17:07:44 InnoDB: Waiting for the background threads to start
150125 17:07:45 InnoDB: Waiting for the background threads to start
150125 17:07:46 InnoDB: Waiting for the background threads to start
150125 17:07:47 InnoDB: Waiting for the background threads to start

Change the following two parameters in my.cnf:
innodb_force_recovery=6
innodb_purge_threads=0

Restart MySQL:

150125 17:10:47 [Note] Crash recovery finished.
150125 17:10:47 [Note] Server socket created on IP: '0.0.0.0'.
150125 17:10:47 [Note] Event Scheduler: Loaded 0 events
150125 17:10:47 [Note] /vdata/webserver/mysql/bin/mysqld: ready for connections.
Version: '5.5.37-MariaDB-log' socket: '/tmp/mysql.sock' port: 3306 Source distribution

Immediately take a logical export of the database. Once it completes, set innodb_force_recovery back to 0 and innodb_purge_threads=1, then rebuild the database.
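As a rough sketch of the export step, the helper below assembles a mysqldump command line for a full logical dump. It only builds the argument list and does not run anything. The function name `build_dump_command`, the `root` user, and the output path are assumptions for illustration; the socket path is the one shown in the startup log above. Password handling is deliberately left out.

```python
def build_dump_command(output_path, socket_path="/tmp/mysql.sock", user="root"):
    """Build (but do not run) a mysqldump command for a full logical export.

    Hypothetical helper: user, socket and output path are illustrative
    defaults, not values taken from the original incident.
    """
    return [
        "mysqldump",
        f"--user={user}",
        f"--socket={socket_path}",
        "--all-databases",   # export every schema on the instance
        "--routines",        # include stored procedures and functions
        "--events",          # include Event Scheduler events
        f"--result-file={output_path}",
    ]


command = build_dump_command("/backup/all_databases.sql")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` once the server is accepting connections in forced-recovery mode.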
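In addition, on MySQL 5.5 and earlier, the combination of innodb_purge_threads=1 and innodb_force_recovery>1 triggers the repeated warning shown above (innodb_force_recovery=1 is not affected).
Cause:
The MySQL source code shows that setting innodb_purge_threads together with innodb_force_recovery leads to this loop: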

while (srv_shutdown_state == SRV_SHUTDOWN_NONE) {
    if (srv_thread_has_reserved_slot(SRV_MASTER) == ULINT_UNDEFINED
        || (srv_n_purge_threads == 1
            && srv_thread_has_reserved_slot(SRV_WORKER)
               == ULINT_UNDEFINED)) {
        ut_print_timestamp(stderr);
        fprintf(stderr, " InnoDB: Waiting for the background threads to start\n");
        os_thread_sleep(1000000);
    } else {
        break;
    }
}

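When srv_n_purge_threads is 1, the loop keeps waiting until the purge worker has reserved its slot. With innodb_force_recovery>1 that evidently never happens, so the message repeats indefinitely. Therefore, whenever innodb_force_recovery has to be set above 1, disable innodb_purge_threads by setting it to 0 (the default).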
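The rule can be checked before restarting the server. The sketch below reads the [mysqld] section of a my.cnf-style text and flags the combination described in this article. It is a minimal illustration under stated assumptions: the function names `read_mysqld_options` and `hits_startup_loop` are hypothetical, the parser handles only plain key=value lines, and both options are assumed to default to 0 as on 5.5.

```python
def read_mysqld_options(cnf_text):
    """Collect key=value options from the [mysqld] section of my.cnf-style text."""
    options = {}
    section = None
    for raw in cnf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip().lower()
            continue
        if section == "mysqld" and "=" in line:
            key, value = line.split("=", 1)
            # Option names may be written with '-' or '_'; normalise to '_'.
            options[key.strip().replace("-", "_")] = value.strip()
    return options


def hits_startup_loop(options):
    """True for the combination this article reports as looping on 5.5."""
    force = int(options.get("innodb_force_recovery", 0))
    purge = int(options.get("innodb_purge_threads", 0))  # assumed default: 0
    return force > 1 and purge == 1


sample = """
[mysqld]
innodb_force_recovery=6
innodb_purge_threads=1
"""
print(hits_startup_loop(read_mysqld_options(sample)))  # True
```

Changing innodb_purge_threads to 0 in the sample makes the check return False, which matches the configuration used in this case.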

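IV. Summary
A crash of MySQL itself, or of the server hosting it, can cause all kinds of problems: error 1594 between master and slave (enabling crash-safe in 5.6 goes a long way toward avoiding error 1594; a later post on new 5.6 features will cover it), error 1236, corrupted logs, corrupted data files, and so on. This case is only one of them. Read the logs carefully for the relevant error messages and work through the problem step by step.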