MySQL Cluster Setup (5): MHA High-Availability Architecture

Date: 2020-01-05

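Earlier articles in this series covered how to build a MySQL cluster starting from a single node.

This article looks at another widely used high-availability solution: MHA.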

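1 Overview

1.1 Introduction to MHA

MHA (Master High Availability) is a high-availability program implemented in Perl. When a failure occurs, MHA performs master failover and slave promotion with minimal downtime (typically 10-30 seconds). MHA prevents replication consistency problems, is easy to install, and does not require changes to an existing deployment.

MHA consists of MHA Manager and MHA Node. MHA Manager is a monitoring and management program that watches the state of the MySQL master. MHA Node is a set of failover tool scripts that, for example, parse MySQL binary and relay logs and transfer and apply events to slaves. MHA Node runs on every MySQL server.

Source: MHA Wiki

MHA Manager invokes the MHA Node tool scripts by connecting to each host over SSH and running commands there, so key-based SSH trust must be set up between the nodes.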

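1.2 How MHA avoids data loss

When the master goes down, MHA tries to save the dead master's binary logs. It then determines which instance in the cluster has the most recent relay log and ships the differential log events from that instance to the other instances, so that all instances hold the same data. Finally it applies the binary log events saved from the dead master to the chosen node and promotes that node to master.

The detailed flow is:

1. Try to save the binary logs from the dead master
2. Find the slave with the most recent relay log
3. Apply the most recent relay log events to the other instances so that all instances are consistent
4. Apply the binary log events saved from the master
5. Promote one slave to master
6. Point the other slaves at the new master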
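
Finding the slave with the most recent relay log comes down to comparing how far each slave's I/O thread has read into the master's binary log. The sketch below only illustrates that comparison and is not MHA's own code. It assumes Master_Log_File and Read_Master_Log_Pos have already been collected from SHOW SLAVE STATUS on each slave and that binary log file names share the same zero-padded width; pick_latest_slave is a hypothetical helper name.

```shell
# Hypothetical helper, not part of MHA.
# stdin: one "host master_log_file read_master_log_pos" triple per line.
# Prints the host that has read furthest into the master's binary log.
pick_latest_slave() {
  # Sort by log file name, then numerically by position; the last line wins.
  LC_ALL=C sort -k2,2 -k3,3n | tail -n 1 | awk '{ print $1 }'
}

# Example:
# printf '%s\n' '10.0.0.248 mysql-bin.000005 154' \
#               '10.0.0.249 mysql-bin.000005 120' | pick_latest_slave
```

With the two sample lines in the comment, the helper prints 10.0.0.248, because both slaves are in the same log file and 154 is the higher position.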

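As the flow shows, if the dead master's host cannot be reached over SSH, the first step cannot be carried out, so with MySQL versions before 5.5 there is still a risk of data loss. From 5.5 onward, enabling semi-synchronous replication goes a long way toward avoiding data loss: it guarantees that at least one slave (not all of them) has received the binary log events when the master commits. Because MHA can then resolve the inconsistencies among slaves, it can achieve "almost no data loss" together with "slave consistency".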

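1.3 Advantages and limitations of MHA

Advantages

- Open source, written in Perl
- Mature solution: during failover MHA fills in missing log events, which minimizes data loss and keeps the data consistent
- Deployment does not require changes to the existing architecture

Limitations

- SSH trust must be set up between all nodes, which carries some security risk
- No high availability for slaves
- The bundled scripts are limited; for example, configuring a virtual IP requires writing your own commands or relying on other software
- Relay logs must be purged manually

1.4 Two common replication layouts with MHA

Single master, multiple slaves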

      M(RW)
        |
+-------+-------+
S1(R)  S2(R)   S3(R)

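This layout is very common. When the master goes down, MHA promotes the host with the most recent logs to master. If a particular node should never become master, set no_master=1 for it.

Multiple masters, multiple slaves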

      M(RW)----M2(R, candidate_master=1)
        |
+-------+-------+
S1(R)          S2(R)

A dual-master layout is also common. If the current master crashes, MHA chooses the read-only master as the new master.
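
How no_master and candidate_master steer the choice can be shown with a small filter. This is a deliberately simplified sketch, not MHA's selection logic: it ignores log freshness and replication delay, assumes the servers are listed in configuration order, and pick_new_master is a hypothetical helper name.

```shell
# Hypothetical helper, not part of MHA.
# stdin: one "host candidate_master no_master" triple per line, in config order.
# Prints the first candidate master, or failing that the first promotable host.
pick_new_master() {
  awk '
    $3 == 1                 { next }           # no_master=1: never promote
    $2 == 1 && cand == ""   { cand = $1 }      # first candidate_master=1
    fallback == ""          { fallback = $1 }  # first host that may be promoted
    END {
      if (cand != "")          print cand
      else if (fallback != "") print fallback
    }
  '
}

# Example:
# printf '%s\n' '10.0.0.248 1 0' '10.0.0.249 0 1' | pick_new_master
```

For the layout used in this article the helper prints 10.0.0.248: the standby is flagged as a candidate and the slave is excluded by no_master=1.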

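2 Database environment preparation

This demonstration uses a master-master-slave replication layout; for how to build it, see the earlier articles in this series.

2.1 Node information

IP	OS	Port	MySQL version	Node	Read/write	Description
10.0.0.247	CentOS 6.5	3306	5.7.9	Master	Read-write	Primary master
10.0.0.248	CentOS 6.5	3306	5.7.9	Standby	Read-only, can be switched to read-write	Standby master
10.0.0.249	CentOS 6.5	3306	5.7.9	Slave	Read-only	Slave
10.0.0.24	CentOS 6.5	-	-	manager	-	MHA Manager
10.0.0.237	-	-	-	-	-	VIP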

2.2 Architecture diagram

2.3 Reference configuration

Master1

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2473306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 1
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Master2

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2483306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 2
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Slave

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_db/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_db
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_db/mysql.sock
pid-file = /data/mysql_db/test_db/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2493306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

read_only=1

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_db/mysql-bin
log_bin_index = /data/mysql_log/test_db/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_db/mysql-relay-bin
relay_log_index=/data/mysql_log/test_db/mysql-relay-bin.index
log_error = /data/mysql_log/test_db/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

3 Installing and configuring MHA

3.1 Downloading MHA

Go to the MHA Downloads page and download the Manager and Node packages. My servers run CentOS 6, so I downloaded MHA Manager 0.56 rpm RHEL6 and MHA Node 0.56 rpm RHEL6.

3.2 Installing MHA

Node installation

Run on all hosts (including the Manager):

# Install dependencies
yum install perl perl-devel perl-DBD-MySQL
# Install the node tools
rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm

Manager installation

Run on the Manager host:

# Install dependencies
yum install -y perl perl-devel perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
# Install the manager
rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm

Reference: MHA Installation

3.3 Creating the MHA management user

The management user needs to run database administration commands, including STOP SLAVE, CHANGE MASTER, and RESET SLAVE.

create user mha_manager@'%' identified by 'mha_manager';
grant all on *.* to mha_manager@'%';
flush privileges;

3.4 Granting sudo privileges to the mysql user

Configuring the VIP requires sudo privileges.

Open /etc/sudoers and add one entry:

root    ALL=(ALL)       ALL
# Added line
mysql   ALL=(ALL)       NOPASSWD: ALL

Then comment out Defaults requiretty:

# Defaults    requiretty

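3.5 Configuring passwordless SSH login between hosts

Run on all hosts:

# Switch to the mysql user
su - mysql

# Generate a key pair: run the command, then press Enter at each prompt
ssh-keygen -t rsa

# Copy the public key to each host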
ssh-copy-id mysql@10.0.0.247
ssh-copy-id mysql@10.0.0.248
ssh-copy-id mysql@10.0.0.249
ssh-copy-id mysql@10.0.1.24
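
Every database node has to reach every other one as the mysql user without a password prompt. The sketch below is a dry run that only prints one test command per ordered host pair, so the list can be reviewed before anything is executed. ssh_check_commands is a hypothetical helper name; BatchMode=yes is a standard OpenSSH option that makes ssh fail instead of asking for a password.

```shell
# Hypothetical dry-run helper: prints one ssh test command per ordered host pair.
# Nothing is executed here; review the output, then run it from the manager.
ssh_check_commands() {
  for src in "$@"; do
    for dst in "$@"; do
      if [ "$src" != "$dst" ]; then
        printf 'ssh -o BatchMode=yes mysql@%s "ssh -o BatchMode=yes mysql@%s true"\n' "$src" "$dst"
      fi
    done
  done
}

# Example:
# ssh_check_commands 10.0.0.247 10.0.0.248 10.0.0.249
```

For three hosts this yields six commands, one per direction of each pair.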

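3.6 Configuring the Manager

Create the /etc/masterha directory; the configuration files will be kept here.

mkdir /etc/masterha

Create the configuration file /etc/masterha/app1.cnf with the following content: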

[server default]
manager_workdir=/etc/masterha                  # Manager working directory; adjust as needed
manager_log=/etc/masterha/manager.log          # Manager log file
master_binlog_dir=/data/mysql_log/test_db      # Location of the master's binary logs
master_ip_failover_script= /etc/masterha/script/master_ip_failover            # Script run during automatic failover; see the appendix
master_ip_online_change_script= /etc/masterha/script/master_ip_online_change  # Script run during a manual switchover; see the appendix

user=mha_manager            # Management user, used to monitor and configure MySQL (STOP SLAVE, CHANGE MASTER, RESET SLAVE); defaults to root
password=mha_manager        # Management user's password

repl_user=repl              # Replication user name
repl_password=repl          # Replication user's password

ping_interval=1             # Interval in seconds between ping checks; failover starts automatically after three missed responses
remote_workdir=/tmp         # Working directory on the remote MySQL hosts

report_script=/etc/masterha/script/send_report    # Script run after a switchover

# Secondary check script
secondary_check_script= /usr/bin/masterha_secondary_check -s 10.0.0.247 -s 10.0.0.248

shutdown_script=""              # Script that shuts down the failed host after a failure (can be used to prevent split-brain)

ssh_user=mysql                  # SSH login user

[server1]
hostname=10.0.0.247
port=3306

[server2]
hostname=10.0.0.248
port=3306
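candidate_master=1   # Marks this server as a candidate master; on failover it becomes the new master even if it does not hold the most recent data
check_repl_delay=0   # By default MHA will not pick a slave that is more than 100 MB of relay logs behind the master, because recovery would take a long time; check_repl_delay=0 ignores the delay and is meant to be combined with candidate_master=1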

[server3]
hostname=10.0.0.249
port=3306
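no_master=1         # Never promote this host to master
ignore_fail=1       # By default MHA does not fail over if a slave is down; ignore_fail=1 makes it ignore a failure of this server

Create the configuration file /etc/masterha/app2.cnf, which treats the standby master as the master, so that MHA can be restarted easily after a switchover: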

[server default]
manager_workdir=/etc/masterha                  # Manager working directory; adjust as needed
manager_log=/etc/masterha/manager.log          # Manager log file
master_binlog_dir=/data/mysql_log/test_db      # Location of the master's binary logs
master_ip_failover_script= /etc/masterha/script/master_ip_failover            # Script run during automatic failover
master_ip_online_change_script= /etc/masterha/script/master_ip_online_change  # Script run during a manual switchover

user=mha_manager            # Management user, used to monitor and configure MySQL (STOP SLAVE, CHANGE MASTER, RESET SLAVE); defaults to root
password=mha_manager        # Management user's password

repl_user=repl              # Replication user name
repl_password=repl          # Replication user's password

ping_interval=1             # Interval in seconds between ping checks; failover starts automatically after three missed responses
remote_workdir=/tmp         # Working directory on the remote MySQL hosts

report_script=/etc/masterha/script/send_report    # Script run after a switchover

# Secondary check script
secondary_check_script= /usr/bin/masterha_secondary_check -s 10.0.0.248 -s 10.0.0.247

shutdown_script=""              # Script that shuts down the failed host after a failure (can be used to prevent split-brain)

ssh_user=mysql                  # SSH login user

[server1]
hostname=10.0.0.248
port=3306

[server2]
hostname=10.0.0.247
port=3306
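candidate_master=1   # Marks this server as a candidate master; on failover it becomes the new master even if it does not hold the most recent data
check_repl_delay=0   # By default MHA will not pick a slave that is more than 100 MB of relay logs behind the master, because recovery would take a long time; check_repl_delay=0 ignores the delay and is meant to be combined with candidate_master=1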

[server3]
hostname=10.0.0.249
port=3306
no_master=1         # Never promote this host to master
ignore_fail=1       # By default MHA does not fail over if a slave is down; ignore_fail=1 makes it ignore a failure of this server
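
The two files differ only in that the addresses of the two masters are exchanged, so app2.cnf can be derived from app1.cnf instead of being maintained by hand. The sketch below does the exchange with sed. swap_hosts is a hypothetical helper name, and the code assumes the placeholder @@HOST_A@@ does not occur in the file and that neither address is a prefix of a longer address in it.

```shell
# Hypothetical helper: exchange two host addresses in a config read from stdin.
# Usage: swap_hosts HOST_A HOST_B < app1.cnf > app2.cnf
swap_hosts() {
  # Escape the dots so they match literally in the sed patterns.
  a_re=$(printf '%s\n' "$1" | sed 's/[.]/\\./g')
  b_re=$(printf '%s\n' "$2" | sed 's/[.]/\\./g')
  # A -> placeholder, B -> A, placeholder -> B
  sed -e "s/$a_re/@@HOST_A@@/g" -e "s/$b_re/$1/g" -e "s/@@HOST_A@@/$2/g"
}

# Example:
# swap_hosts 10.0.0.247 10.0.0.248 < /etc/masterha/app1.cnf > /etc/masterha/app2.cnf
```

Going through a placeholder keeps the second substitution from undoing the first.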

Note: remove the comments before using these files.
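
Removing the comments by hand is error-prone, so it can be scripted. The following is a minimal sketch with sed, assuming that no configuration value itself contains a # character (a password containing # would be cut off); strip_cnf_comments is a hypothetical helper name.

```shell
# Hypothetical helper: drop "#" comments and the blank lines they leave behind.
# Reads the named files, or stdin when no file is given.
strip_cnf_comments() {
  sed -e 's/[[:space:]]*#.*$//' -e '/^[[:space:]]*$/d' "$@"
}

# Example:
# strip_cnf_comments app1.annotated.cnf > /etc/masterha/app1.cnf
```

The file name app1.annotated.cnf in the example is only a placeholder for wherever the commented version is kept.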

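3.7 Configuring the switchover scripts

Ways to manage the VIP

There are two ways to manage the VIP with MHA: use Keepalived, or add and remove the VIP with your own commands. Keepalived tends to move the VIP on network jitter and cannot be used on hosts that run multiple instances, so managing the VIP with scripts is recommended.

The network interface on these hosts is eth0, and the VIP can be added and removed with the following commands.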

up VIP

sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255

down VIP

sudo /sbin/ifconfig eth0:1 down

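Setting up the switchover scripts

The master_ip_failover, master_ip_online_change, and send_report scripts are in the appendix.

Changing the MySQL configuration

MHA's checks are fairly strict, so every node except the master is set to read_only; if necessary this can also be written into the configuration file.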

# mysql shell
set global read_only=1;

MHA needs the relay logs to bring the data into a consistent state, so automatic relay log purging must be disabled on all nodes.

# mysql shell
set global relay_log_purge=0;

This can also be written into the configuration file:

# my.cnf
relay_log_purge=0

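Common MHA commands

Manager

masterha_check_ssh              Check MHA's SSH configuration
masterha_check_repl             Check MySQL replication status
masterha_manager                Start MHA
masterha_stop                   Stop MHA
masterha_check_status           Check the current MHA running status
masterha_master_monitor         Detect whether the master is down
masterha_master_switch          Manual failover or switchover
masterha_conf_host              Add or remove server entries in the configuration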

Node

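save_binary_logs                Save the master's binary logs
apply_diff_relay_logs           Identify the differing relay log events and apply them
purge_relay_logs                Purge relay logs (with MHA, relay logs must be purged with this command)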
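
Since automatic purging is disabled, purge_relay_logs is usually run from cron on each MySQL host. The sketch below prints a daily crontab entry. It is built on assumptions: the /usr/bin/purge_relay_logs path and the /tmp/purge_relay_logs.log log file are guesses for this environment, the credentials are the management user from this article, and purge_cron_entry is a hypothetical helper name. Passing a different minute on each host keeps the slaves from purging at the same moment.

```shell
# Hypothetical helper: print a daily crontab entry for purge_relay_logs.
# $1 = minute, $2 = hour (use a different minute on each host)
purge_cron_entry() {
  printf '%s %s * * * /usr/bin/purge_relay_logs --user=mha_manager --password=mha_manager --disable_relay_log_purge --workdir=/tmp >> /tmp/purge_relay_logs.log 2>&1\n' "$1" "$2"
}

# Example: print the entry for 04:10, then add it with `crontab -e`
# purge_cron_entry 10 4
```

The --disable_relay_log_purge option makes the tool switch relay_log_purge back off after it finishes, so the setting from this section stays in effect.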

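Usage for each command is available by running it with --help.

Verifying that SSH works and that replication is healthy

On the manager node, run masterha_check_ssh --conf=/etc/masterha/app1.cnf to check SSH connectivity. The output is shown below: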

[mysql@chengqm ~]$ masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Dec 20 19:47:18 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Dec 20 19:47:18 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Dec 20 19:47:18 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Dec 20 19:47:18 2018 - [info] Starting SSH connection tests..
Thu Dec 20 19:47:19 2018 - [debug] 
Thu Dec 20 19:47:18 2018 - [debug]  Connecting via SSH from mysql@10.0.0.247(10.0.0.247:22) to mysql@10.0.0.248(10.0.0.248:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from mysql@10.0.0.247(10.0.0.247:22) to mysql@10.0.0.249(10.0.0.249:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:19 2018 - [debug] 
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from mysql@10.0.0.248(10.0.0.248:22) to mysql@10.0.0.247(10.0.0.247:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from mysql@10.0.0.248(10.0.0.248:22) to mysql@10.0.0.249(10.0.0.249:22)..
Thu Dec 20 19:47:19 2018 - [debug]   ok.
Thu Dec 20 19:47:20 2018 - [debug] 
Thu Dec 20 19:47:19 2018 - [debug]  Connecting via SSH from mysql@10.0.0.249(10.0.0.249:22) to mysql@10.0.0.247(10.0.0.247:22)..
Thu Dec 20 19:47:20 2018 - [debug]   ok.
Thu Dec 20 19:47:20 2018 - [debug]  Connecting via SSH from mysql@10.0.0.249(10.0.0.249:22) to mysql@10.0.0.248(10.0.0.248:22)..
Thu Dec 20 19:47:20 2018 - [debug]   ok.
Thu Dec 20 19:47:20 2018 - [info] All SSH connection tests passed successfully.

On the manager node, run masterha_check_repl --conf=/etc/masterha/app1.cnf to check replication status. The output is shown below:

[mysql@chengqm ~]$ masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Dec 20 20:05:03 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Dec 20 20:05:03 2018 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Dec 20 20:05:03 2018 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Dec 20 20:05:03 2018 - [info] MHA::MasterMonitor version 0.56.
Thu Dec 20 20:05:03 2018 - [info] Multi-master configuration is detected. Current primary(writable) master is 10.0.0.247(10.0.0.247:3306)
Thu Dec 20 20:05:03 2018 - [info] Master configurations are as below: 
Master 10.0.0.247(10.0.0.247:3306), replicating from 10.0.0.248(10.0.0.248:3306)
Master 10.0.0.248(10.0.0.248:3306), replicating from 10.0.0.247(10.0.0.247:3306), read-only
================ omitted ==================
Thu Dec 20 20:05:08 2018 - [info]   /etc/masterha/script/master_ip_failover --command=status --ssh_user=mysql --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 
Thu Dec 20 20:05:08 2018 - [info]  OK.
Thu Dec 20 20:05:08 2018 - [warning] shutdown_script is not defined.
Thu Dec 20 20:05:08 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

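The message MySQL Replication Health is OK. indicates success.

If the error Failed to get master_ip_failover_script status with return code 255:0 appears, comment out the FIXME_xxx lines in the master_ip_failover script.

Note: for the checks to run, the mysqlbinlog and mysql commands must be on the system PATH.

4 Startup and testing

4.1 Startup

Managing the VIP with scripts does not bring the VIP up automatically, so first set the VIP manually on the master: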

[root@cluster01 ~]# /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255
[root@cluster01 ~]# ifconfig
eth0      Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.247  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fede:8033/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:17333247 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5472004 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1476157398 (1.3 GiB)  TX bytes:1064253754 (1014.9 MiB)

eth0:1    Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.237  Bcast:10.0.0.237  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
...

Start MHA Manager

[mysql@chengqm ~]$ nohup /usr/bin/masterha_manager --conf=/etc/masterha/app1.cnf --ignore_last_failover &
[1] 21668

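--ignore_last_failover ignores the previous failover. After every failover MHA creates a file such as app1.failover.complete; without this option, that file must be deleted before MHA can be started again.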
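
As an alternative to passing --ignore_last_failover every time, the marker file can be removed before a restart. The sketch below assumes the marker lives in manager_workdir and is named after the configuration file, which matches the app1.failover.complete name above; clear_failover_marker is a hypothetical helper name.

```shell
# Hypothetical helper: remove the marker left by a previous failover, if present.
# $1 = manager_workdir, $2 = application name (config file name without .cnf)
clear_failover_marker() {
  marker="$1/$2.failover.complete"
  if [ -e "$marker" ]; then
    rm -f -- "$marker" && echo "removed $marker"
  else
    echo "no marker at $marker"
  fi
}

# Example:
# clear_failover_marker /etc/masterha app1
```

Printing which case applied makes it easier to see in a restart log whether a failover had happened since the last start.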

Check the startup log

[mysql@chengqm ~]$ tail -18 /etc/masterha/manager.log 
Fri Dec 21 13:56:39 2018 - [info] 
10.0.0.247(10.0.0.247:3306) (current master)
 +--10.0.0.248(10.0.0.248:3306)
 +--10.0.0.249(10.0.0.249:3306)

Fri Dec 21 13:56:39 2018 - [info] Checking master_ip_failover_script status:
Fri Dec 21 13:56:39 2018 - [info]   /etc/masterha/script/master_ip_failover --command=status --ssh_user=mysql --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 


 VIP Command: start=sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Check script.. OK 
Fri Dec 21 13:56:39 2018 - [info]  OK.
Fri Dec 21 13:56:39 2018 - [warning] shutdown_script is not defined.
Fri Dec 21 13:56:39 2018 - [info] Set master ping interval 1 seconds.
Fri Dec 21 13:56:39 2018 - [info] Set secondary check script: /usr/bin/masterha_secondary_check -s 10.0.0.247 -s 10.0.0.248
Fri Dec 21 13:56:39 2018 - [info] Starting ping health check on 10.0.0.247(10.0.0.247:3306)..
Fri Dec 21 13:56:39 2018 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..

The log line Ping(SELECT) succeeded, waiting until MySQL doesn't respond indicates that the manager started successfully.
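
When the manager is started from a script, the same check can be automated by polling the log for that line. This is a minimal sketch, assuming the log file is fresh for this run (an older run's line in the same file would also match); wait_for_manager is a hypothetical helper name.

```shell
# Hypothetical helper: wait until the manager log reports a running health check.
# $1 = path to the manager log, $2 = number of one-second attempts (default 30)
wait_for_manager() {
  log=$1
  tries=${2:-30}
  while [ "$tries" -gt 0 ]; do
    # -F: match the text literally, -q: no output, only the exit status
    if grep -qF 'Ping(SELECT) succeeded' "$log" 2>/dev/null; then
      return 0
    fi
    tries=$((tries - 1))
    sleep 1
  done
  return 1
}

# Example:
# wait_for_manager /etc/masterha/manager.log 60 || echo "manager did not start"
```

The function returns 0 as soon as the line appears and 1 if it never does within the given number of attempts.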

In the master's general log you can see MHA repeatedly running the check statement SELECT 1 As Value.

4.2 Failover

We simulate a crash of the master database:

[root@cluster01 ~]# ps -ef | grep mysql
mysql    20061     1  0 11:19 pts/0    00:00:00 /bin/sh /usr/local/mysql57/bin/mysqld_safe --defaults-file=/data/mysql_db/test_db/my.cnf --datadir=/data/mysql_db/test_db --pid-file=/data/mysql_db/test_db/mysql.pid
mysql    20494 20061  0 11:19 pts/0    00:00:21 /usr/local/mysql57/bin/mysqld --defaults-file=/data/mysql_db/test_db/my.cnf --basedir=/usr/local/mysql57 --datadir=/data/mysql_db/test_db --plugin-dir=/usr/local/mysql57/lib/plugin --log-error=/data/mysql_log/test_db/mysql-error.log --pid-file=/data/mysql_db/test_db/mysql.pid --socket=/data/mysql_db/test_db/mysql.sock --port=3306
[root@cluster01 ~]# kill -9 20061 20494

The MHA log shows the whole failover process:

Fri Dec 21 14:04:49 2018 - [warning] Got error on MySQL select ping: 2006 (MySQL server has gone away)
Fri Dec 21 14:04:49 2018 - [info] Executing secondary network check script: /usr/bin/masterha_secondary_check -s 10.0.0.247 -s 10.0.0.248  --user=mysql  --master_host=10.0.0.247  --master_ip=10.0.0.247  --master_port=3306 --master_user=mha_manager --master_password=mha_manager --ping_type=SELECT
Fri Dec 21 14:04:49 2018 - [info] Executing SSH check script: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql_log/test_db --output_file=/tmp/save_binary_logs_test --manager_version=0.56 --binlog_prefix=mysql-bin
Monitoring server 10.0.0.247 is reachable, Master is not reachable from 10.0.0.247. OK.
Fri Dec 21 14:04:49 2018 - [info] HealthCheck: SSH to 10.0.0.247 is reachable.
Monitoring server 10.0.0.248 is reachable, Master is not reachable from 10.0.0.248. OK.
Fri Dec 21 14:04:49 2018 - [info] Master is not reachable from all other monitoring servers. Failover should start.
=============== omitted ================
Fri Dec 21 14:04:52 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Fri Dec 21 14:04:52 2018 - [info] Executing master IP deactivation script:
Fri Dec 21 14:04:52 2018 - [info]   /etc/masterha/script/master_ip_failover --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 --command=stopssh --ssh_user=mysql  


 VIP Command: start=sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Disabling the VIP on old master: 10.0.0.247 
SIOCSIFFLAGS: Cannot assign requested address
Fri Dec 21 14:04:52 2018 - [info]  done.
=============== omitted ================
Fri Dec 21 14:04:53 2018 - [info] Starting master failover..
Fri Dec 21 14:04:53 2018 - [info] 
From:
10.0.0.247(10.0.0.247:3306) (current master)
 +--10.0.0.248(10.0.0.248:3306)
 +--10.0.0.249(10.0.0.249:3306)

To:
10.0.0.248(10.0.0.248:3306) (new master)
 +--10.0.0.249(10.0.0.249:3306)
Fri Dec 21 14:04:53 2018 - [info]
=============== omitted ================
Fri Dec 21 14:04:53 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='10.0.0.248', MASTER_PORT=3306, MASTER_LOG_FILE='mysql-bin.000005', MASTER_LOG_POS=154, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Fri Dec 21 14:04:53 2018 - [info] Executing master IP activate script:
Fri Dec 21 14:04:53 2018 - [info]   /etc/masterha/script/master_ip_failover --command=start --ssh_user=mysql --orig_master_host=10.0.0.247 --orig_master_ip=10.0.0.247 --orig_master_port=3306 --new_master_host=10.0.0.248 --new_master_ip=10.0.0.248 --new_master_port=3306 --new_master_user='mha_manager' --new_master_password='mha_manager'  


 VIP Command: start=sudo /sbin/ifconfig eth0:1 10.0.0.237 netmask 255.255.255.255 stop=sudo /sbin/ifconfig eth0:1 down

Set read_only=0 on the new master.
Enabling the VIP - 10.0.0.237 on the new master - 10.0.0.248 
=============== omitted ================
Fri Dec 21 14:04:55 2018 - [info]  10.0.0.248: Resetting slave info succeeded.
Fri Dec 21 14:04:55 2018 - [info] Master failover to 10.0.0.248(10.0.0.248:3306) completed successfully.

Check the VIP on the new master

[mysql@cluster02 ~]$ ifconfig
eth0      Link encap:Ethernet  HWaddr FA:16:3E:66:7E:E8  
          inet addr:10.0.0.248  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fe66:7ee8/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:40197173 errors:0 dropped:0 overruns:0 frame:0
          TX packets:10470689 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:4063358126 (3.7 GiB)  TX bytes:2269241789 (2.1 GiB)

eth0:1    Link encap:Ethernet  HWaddr FA:16:3E:66:7E:E8  
          inet addr:10.0.0.237  Bcast:10.0.0.237  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

The VIP has been moved successfully.
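
The same check can be scripted instead of reading the ifconfig output by eye. The sketch below assumes the older net-tools output format shown above, where addresses appear as "inet addr:ADDRESS" followed by a space; has_vip is a hypothetical helper name.

```shell
# Hypothetical helper: succeed if the given address appears in ifconfig output.
# stdin: output of `ifconfig`; $1 = the VIP to look for
has_vip() {
  # -F matches the text literally; the trailing space keeps 10.0.0.23
  # from matching 10.0.0.237.
  grep -qF "inet addr:$1 "
}

# Example:
# ifconfig | has_vip 10.0.0.237 && echo "VIP is on this host"
```

Run on both masters, this shows at a glance which host currently holds the VIP.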

The new master's general log shows what MHA did. Part of the log is shown below:

...
2018-12-21T14:04:41.782336+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:41.788318+08:00 5525 Query    STOP SLAVE IO_THREAD
2018-12-21T14:04:41.900734+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:42.044801+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:42.668581+08:00 5525 Query    SHOW SLAVE STATUS
2018-12-21T14:04:42.670336+08:00 5525 Query    STOP SLAVE SQL_THREAD
...
2018-12-21T14:04:42.863904+08:00 5526 Query    SET GLOBAL read_only=0
...
2018-12-21T14:04:43.950986+08:00 5527 Query    SET @rpl_semi_sync_slave= 1
...

The slave's general log shows the slave being repointed to the new master:

2018-12-21T14:04:04.835218+08:00   90 Query    STOP SLAVE IO_THREAD
2018-12-21T14:04:04.955706+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:05.092123+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.018838+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.034225+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.036613+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.038475+08:00   90 Query    STOP SLAVE SQL_THREAD
2018-12-21T14:04:06.160142+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.162224+08:00   90 Query    STOP SLAVE
2018-12-21T14:04:06.163171+08:00   90 Query    SHOW SLAVE STATUS
2018-12-21T14:04:06.164554+08:00   90 Query    RESET SLAVE
2018-12-21T14:04:06.825564+08:00   90 Query    CHANGE MASTER TO MASTER_HOST = '10.0.0.248' MASTER_USER = 'repl' MASTER_PASSWORD = <secret> MASTER_PORT = 3306 MASTER_LOG_FILE = 'mysql-bin.000005' MASTER_LOG_POS = 154
2018-12-21T14:04:06.981718+08:00   90 Query    SET GLOBAL relay_log_purge=0
2018-12-21T14:04:06.982802+08:00   90 Query    START SLAVE

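Note: the MHA Manager process exits once the failover completes.

4.3 Manual switchover

After the failover the master is Cluster2. Repoint Cluster1 to replicate from Cluster2, then test a manual switchover that moves the master back to Cluster1 with the following command:

masterha_master_switch --conf=/etc/masterha/app2.cnf --master_state=alive --new_master_host=10.0.0.247 --new_master_port=3306 --orig_master_is_new_slave

--orig_master_is_new_slave turns the original master into a slave of the new master. It is not enabled by default.

The run is shown below. Two prompts need a yes/no answer: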

[mysql@chengqm ~]$ masterha_master_switch --conf=/etc/masterha/app2.cnf --master_state=alive --new_master_host=10.0.0.247 --new_master_port=3306 --orig_master_is_new_slave

......

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 10.0.0.248(10.0.0.248:3306)? (YES/no): yes

......

Sun Dec 23 16:50:48 2018 - [info] 
From:
10.0.0.248(10.0.0.248:3306) (current master)
 +--10.0.0.247(10.0.0.247:3306)
 +--10.0.0.249(10.0.0.249:3306)

To:
10.0.0.247(10.0.0.247:3306) (new master)
 +--10.0.0.249(10.0.0.249:3306)
 +--10.0.0.248(10.0.0.248:3306)

Starting master switch from 10.0.0.248(10.0.0.248:3306) to 10.0.0.247(10.0.0.247:3306)? (yes/NO): yes

......

Sun Dec 23 16:51:36 2018 - [info]  10.0.0.247: Resetting slave info succeeded.
Sun Dec 23 16:51:36 2018 - [info] Switching master to 10.0.0.247(10.0.0.247:3306) completed successfully.

The switchover succeeded. Check the VIP on Cluster1:

[mysql@cluster01 ~]$ ifconfig
eth0      Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.247  Bcast:10.0.255.255  Mask:255.255.0.0
          inet6 addr: fe80::f816:3eff:fede:8033/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:20585872 errors:0 dropped:0 overruns:0 frame:0
          TX packets:5519122 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000 
          RX bytes:1785787985 (1.6 GiB)  TX bytes:1068115408 (1018.6 MiB)

eth0:1    Link encap:Ethernet  HWaddr FA:16:3E:DE:80:33  
          inet addr:10.0.0.237  Bcast:10.0.0.237  Mask:255.255.255.255
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

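Note: stop MHA Manager before performing a manual switchover.

4.4 Stopping MHA

The command to stop MHA is shown below; it is not demonstrated here.

masterha_stop --conf=<path to config file>

5 Summary

Overall, MHA is an excellent and fairly widely used high-availability program. It fills in missing log events automatically so that consistency is preserved, and it can be deployed without changing the existing architecture. It is somewhat complicated to operate, however: MHA does not manage the VIP, so you have to script that yourself; it only provides high availability for the master, not for slaves; and relay logs have to be purged by a scheduled task that you set up yourself.

All the same, in the absence of a better option, MHA is still worth using.

Appendix

master_ip_failover script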

#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;

my (
  $command,        $ssh_user,         $orig_master_host,
  $orig_master_ip, $orig_master_port, $new_master_host,
  $new_master_ip,  $new_master_port,  $new_master_user,
  $new_master_password
);

my $vip = '10.0.0.237';
my $key = '1';
my $ssh_start_vip = "sudo /sbin/ifconfig eth0:$key $vip netmask 255.255.255.255";
my $ssh_stop_vip = "sudo /sbin/ifconfig eth0:$key down";

GetOptions(
  'command=s'             => \$command,
  'ssh_user=s'            => \$ssh_user,
  'orig_master_host=s'    => \$orig_master_host,
  'orig_master_ip=s'      => \$orig_master_ip,
  'orig_master_port=i'    => \$orig_master_port,
  'new_master_host=s'     => \$new_master_host,
  'new_master_ip=s'       => \$new_master_ip,
  'new_master_port=i'     => \$new_master_port,
  'new_master_user=s'     => \$new_master_user,
  'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {

  print "\n\n VIP Command: start=$ssh_start_vip stop=$ssh_stop_vip\n\n";
 
  if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
    # If you manage master ip address at global catalog database,
    # invalidate orig_master_ip here.
    my $exit_code = 1;
    eval {
      print "Disabling the VIP on old master: $orig_master_host \n";
      &stop_vip();
      # updating global catalog, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {

    # all arguments are passed.
    # If you manage master ip address at global catalog database,
    # activate new_master_ip here.
    # You can also grant write access (create user, set read_only=0, etc) here.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print "Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();
      $new_master_handler->disconnect();

      print "Enabling the VIP - $vip on the new master - $new_master_host \n";
      &start_vip();

      $exit_code = 0;
    };
    if ($@) {
      warn $@;

      # If you want to continue failover, exit 10.
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {
    print "Check script.. OK \n";
    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
     return 0  unless  ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
  print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

master_ip_online_change script

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';

use Getopt::Long;
use MHA::DBHelper;
use MHA::NodeUtil;
use Time::HiRes qw( sleep gettimeofday tv_interval );
use Data::Dumper;

my $_tstart;
my $_running_interval = 0.1;
my (
  $command,              $orig_master_is_new_slave, $orig_master_host,
  $orig_master_ip,       $orig_master_port,         $orig_master_user,
  $orig_master_password, $orig_master_ssh_user,     $new_master_host,
  $new_master_ip,        $new_master_port,          $new_master_user,
  $new_master_password,  $new_master_ssh_user
);
GetOptions(
  'command=s'                => \$command,
  'orig_master_is_new_slave' => \$orig_master_is_new_slave,
  'orig_master_host=s'       => \$orig_master_host,
  'orig_master_ip=s'         => \$orig_master_ip,
  'orig_master_port=i'       => \$orig_master_port,
  'orig_master_user=s'       => \$orig_master_user,
  'orig_master_password=s'   => \$orig_master_password,
  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,
  'new_master_host=s'        => \$new_master_host,
  'new_master_ip=s'          => \$new_master_ip,
  'new_master_port=i'        => \$new_master_port,
  'new_master_user=s'        => \$new_master_user,
  'new_master_password=s'    => \$new_master_password,
  'new_master_ssh_user=s'    => \$new_master_ssh_user,
);

my $vip = '10.0.0.237';
my $key = '1';
my $ssh_start_vip = "sudo /sbin/ifconfig eth0:$key $vip netmask 255.255.255.255";
my $ssh_stop_vip = "sudo /sbin/ifconfig eth0:$key down";

exit &main();

sub current_time_us {
  my ( $sec, $microsec ) = gettimeofday();
  my $curdate = localtime($sec);
  return $curdate . " " . sprintf( "%06d", $microsec );
}

sub sleep_until {
  my $elapsed = tv_interval($_tstart);
  if ( $_running_interval > $elapsed ) {
    sleep( $_running_interval - $elapsed );
  }
}

sub get_threads_util {
  my $dbh                    = shift;
  my $my_connection_id       = shift;
  my $running_time_threshold = shift;
  my $type                   = shift;
  $running_time_threshold = 0 unless ($running_time_threshold);
  $type                   = 0 unless ($type);
  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");
  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {
    my $id         = $ref->{Id};
    my $user       = $ref->{User};
    my $host       = $ref->{Host};
    my $command    = $ref->{Command};
    my $state      = $ref->{State};
    my $query_time = $ref->{Time};
    my $info       = $ref->{Info};
    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);
    next if ( $my_connection_id == $id );
    next if ( defined($query_time) && $query_time < $running_time_threshold );
    next if ( defined($command)    && $command eq "Binlog Dump" );
    next if ( defined($user)       && $user eq "system user" );
    next
      if ( defined($command)
      && $command eq "Sleep"
      && defined($query_time)
      && $query_time >= 1 );

    if ( $type >= 1 ) {
      next if ( defined($command) && $command eq "Sleep" );
      next if ( defined($command) && $command eq "Connect" );
    }

    if ( $type >= 2 ) {
      next if ( defined($info) && $info =~ m/^select/i );
      next if ( defined($info) && $info =~ m/^show/i );
    }

    push @threads, $ref;
  }
  return @threads;
}

sub main {
  if ( $command eq "stop" ) {
    ## Gracefully killing connections on the current master
    # 1. Set read_only=1 on the new master
    # 2. Wait for current connections to exit
    # 3. Set read_only=1 on the current master
    # 4. Wait for running updates, then kill the remaining queries
    # 5. Remove the VIP from the current master
    # * Any database access failure will result in script die.
    my $exit_code = 1;
    eval {
      ## Setting read_only=1 on the new master (to avoid accident)
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );
      print current_time_us() . " Set read_only on the new master.. ";
      $new_master_handler->enable_read_only();
      if ( $new_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }
      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens
      my $orig_master_handler = new MHA::DBHelper();
      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,
        $orig_master_user, $orig_master_password, 1 );

      $orig_master_handler->disable_log_bin_local();

      ## Waiting for N * 100 milliseconds so that current connections can exit
      my $time_until_read_only = 15;
      $_tstart = [gettimeofday];
      my @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_read_only > 0 && $#threads >= 0 ) {
        if ( $time_until_read_only % 5 == 0 ) {
          printf
"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_read_only * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_read_only--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write
      print current_time_us() . " Set read_only=1 on the orig master.. ";
      $orig_master_handler->enable_read_only();
      if ( $orig_master_handler->is_read_only() ) {
        print "ok.\n";
      }
      else {
        die "Failed!\n";
      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete
      my $time_until_kill_threads = 5;
      @threads = get_threads_util( $orig_master_handler->{dbh},
        $orig_master_handler->{connection_id} );
      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {
        if ( $time_until_kill_threads % 5 == 0 ) {
          printf
"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",
            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;
          if ( $#threads < 5 ) {
            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"
              foreach (@threads);
          }
        }
        sleep_until();
        $_tstart = [gettimeofday];
        $time_until_kill_threads--;
        @threads = get_threads_util( $orig_master_handler->{dbh},
          $orig_master_handler->{connection_id} );
      }

      ## Terminating all threads
      print current_time_us() . " Killing all application threads..\n";
      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );
      print current_time_us() . " done.\n";
      $orig_master_handler->enable_log_bin_local();
      $orig_master_handler->disconnect();

      print "Disabling the VIP on old master: $orig_master_host \n";
      &stop_vip();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "start" ) {
    ## Activating master ip on the new master
    # 1. Set read_only=0 on the new master
    # 2. Bring up the VIP on the new master

    # We don't return an error even if activating the new master fails,
    # so that we don't interrupt slaves' recovery.
    # If the exit code is 0 or 10, MHA does not abort.
    my $exit_code = 10;
    eval {
      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not
      $new_master_handler->connect( $new_master_ip, $new_master_port,
        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master
      $new_master_handler->disable_log_bin_local();
      print current_time_us() . " Set read_only=0 on the new master.\n";
      $new_master_handler->disable_read_only();

      $new_master_handler->enable_log_bin_local();
      $new_master_handler->disconnect();

      print "Enabling the VIP - $vip on the new master - $new_master_host \n";
      &start_vip();

      ## Update master ip on the catalog database, etc
      $exit_code = 0;
    };
    if ($@) {
      warn "Got Error: $@\n";
      exit $exit_code;
    }
    exit $exit_code;
  }
  elsif ( $command eq "status" ) {

    # do nothing
    exit 0;
  }
  else {
    &usage();
    exit 1;
  }
}

sub start_vip() {
  return 0 unless ($new_master_ssh_user);
  `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

sub stop_vip() {
  return 0 unless ($orig_master_ssh_user);
  `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
  print
"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
  die;
}
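
The VIP handling in the script above comes down to two ifconfig commands that are run over SSH on the new and the old master. The following is a dry-run sketch that only prints those commands so they can be checked before a real switch. The function names and the `root` SSH user are assumptions for illustration; the VIP, alias key and interface are taken from the script, and the two hosts from the node table.

```shell
#!/bin/sh
# Dry-run sketch: prints the VIP commands instead of executing them.
# VIP, KEY and IFACE mirror $vip, $key and eth0 in the Perl script.
# The function names are hypothetical helpers, not part of MHA.
VIP='10.0.0.237'
KEY='1'
IFACE='eth0'

# Command used on the new master to bring the VIP up.
vip_start_cmd() {
  printf 'sudo /sbin/ifconfig %s:%s %s netmask 255.255.255.255\n' "$IFACE" "$KEY" "$VIP"
}

# Command used on the old master to take the VIP down.
vip_stop_cmd() {
  printf 'sudo /sbin/ifconfig %s:%s down\n' "$IFACE" "$KEY"
}

# Wrap a remote command roughly the way start_vip()/stop_vip() do.
# Arguments: ssh user, host, command.
ssh_wrap() {
  printf 'ssh %s@%s "%s"\n' "$1" "$2" "$3"
}

# "root" is an assumed SSH user; use the ssh_user from your MHA config.
ssh_wrap root 10.0.0.248 "$(vip_start_cmd)"
ssh_wrap root 10.0.0.247 "$(vip_stop_cmd)"
```

Printing the commands first makes it easy to confirm the interface name and netmask on each host; note that the Perl script silently skips the VIP step when no SSH user is passed in.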

send_report script

#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body, $title, $content);
GetOptions(
  'orig_master_host=s' => \$dead_master_host,
  'new_master_host=s'  => \$new_master_host,
  'new_slave_hosts=s'  => \$new_slave_hosts,
  'subject=s'          => \$subject,
  'body=s'             => \$body,
);

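# Call the external script; it receives the title as $1 and the content as $2.
# new_master_host is unset when recovery fails, so fall back to "unknown";
# otherwise "use warnings FATAL" would abort on the undefined value.
$dead_master_host = "unknown" unless defined $dead_master_host;
$new_master_host  = "unknown" unless defined $new_master_host;
chomp( my $now = `date +'%Y-%m-%d %H:%M'` );
$title   = "[mha switch]";
$content = "$now old_master=$dead_master_host new_master=$new_master_host";
# The list form of system() bypasses the shell, so spaces and brackets are not split or globbed.
system( "sh", "/etc/masterha/script/send_report.sh", $title, $content );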

exit 0;
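
The Perl wrapper above hands the actual notification off to an external shell script. As a rough illustration of what the receiving side could look like, here is a minimal sketch that only appends the report to a log file. It is not the author's send_report.sh: the function names, the `REPORT_LOG` path, and the assumption that the title arrives as the first argument and the content as the second are all hypothetical.

```shell
#!/bin/sh
# Hypothetical receiver sketch, NOT the author's send_report.sh.
# Assumes argument 1 is the title and argument 2 is the content.
# REPORT_LOG is an assumed path; override it through the environment.
REPORT_LOG="${REPORT_LOG:-/tmp/mha_report.log}"

# Build one report line; missing arguments fall back to placeholders.
format_report() {
  title="${1:-[mha switch]}"
  content="${2:-no details}"
  printf '%s %s\n' "$title" "$content"
}

# Append the formatted line to the log file.
write_report() {
  format_report "$@" >> "$REPORT_LOG"
}
```

In a real setup the `write_report` step would typically be replaced by whatever alerting channel is in use (mail, chat webhook, and so on).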

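Scheduled job for purging relay logs

Below is my cron job; replace the parameters with your own. The workdir must be on the same disk partition as the relay logs, because purge_relay_logs creates hard links to the relay logs in workdir and hard links cannot cross filesystems.

# Purge once an hour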
0 * * * * (/usr/bin/purge_relay_logs --user=mha_manager --password=mha_manager --disable_relay_log_purge --port=3306 --workdir=/tmp/relaylogtmp >> /var/log/purge_relay_logs.log 2>&1)
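
Since the workdir has to share a filesystem with the relay logs, it can be worth checking this before installing the cron job. A minimal sketch, assuming GNU `stat` (as on CentOS): it compares the device numbers of the two directories. `RELAY_DIR` is an assumption that the relay logs live in the datadir used in this article; adjust both paths to your own setup.

```shell
#!/bin/sh
# Sketch: check that two directories are on the same filesystem,
# which hard links require. Relies on GNU stat (-c %d prints the device number).
# Returns 0 if same, 1 if different, 2 if a path cannot be inspected.
same_filesystem() {
  dev_a=$(stat -c %d "$1") || return 2
  dev_b=$(stat -c %d "$2") || return 2
  [ "$dev_a" = "$dev_b" ]
}

# RELAY_DIR and WORK_DIR are assumptions; substitute your own paths.
RELAY_DIR="${RELAY_DIR:-/data/mysql_db/test_db}"
WORK_DIR="${WORK_DIR:-/tmp/relaylogtmp}"

# Only report when both directories exist on this host.
if [ -d "$RELAY_DIR" ] && [ -d "$WORK_DIR" ]; then
  if same_filesystem "$RELAY_DIR" "$WORK_DIR"; then
    echo "OK: workdir and relay logs share a filesystem"
  else
    echo "WARNING: workdir is on a different filesystem than the relay logs"
  fi
fi
```

If the check warns, pointing `--workdir` at a directory on the relay log partition is usually simpler than moving the relay logs.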