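Introduction
MHA (Master High Availability) is a relatively mature high-availability solution for MySQL. It was developed by Yoshinori Matsunobu at DeNA in Japan (he later moved to Facebook) and handles failover and slave promotion in MySQL high-availability environments. When the master fails, MHA can complete the failover automatically within roughly 0 to 30 seconds, and it preserves data consistency as far as possible while doing so.
The software consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage several master-slave clusters, or on one of the slave nodes. MHA Node runs on every MySQL server. MHA Manager probes the master of each cluster at regular intervals; when the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves to it. The whole failover is transparent to the application.
During automatic failover MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always feasible. For example, if the master has a hardware failure or cannot be reached over SSH, MHA cannot save the binary logs and fails over without the most recent data. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk, and MHA can be combined with it: if even one slave has received the latest binary log events, MHA can apply them to all the other slaves and so keep every node consistent.
MHA mainly supports a one-master, multi-slave architecture. A replication cluster managed by MHA needs at least three database servers: one master, one candidate master and one further slave. Because three servers are a cost concern, Taobao modified MHA, and Taobao's TMHA now supports one master with one slave. MHA works with any storage engine that supports master-slave replication, not only the transactional InnoDB engine.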
Official introduction: https://code.google.com/p/mysql-master-ha/
[Figure: MHA Manager managing multiple master-slave replication groups]
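How MHA works can be summarized as follows:
(1) Save the binary log events (binlog events) from the crashed master;
(2) Identify the slave that has the latest updates;
(3) Apply the differential relay log to the other slaves;
(4) Apply the binary log events saved from the master;
(5) Promote one slave to be the new master;
(6) Point the other slaves at the new master and resume replication.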
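Step (2) comes down to comparing replication coordinates. The sketch below is not MHA's own code (MHA does this internally in Perl). It assumes that each slave's Master_Log_File and Read_Master_Log_Pos have already been collected into lines of the illustrative form "host file position", and that binlog files are ordered by their numeric suffix. The function name pick_latest_slave is hypothetical.

```shell
#!/bin/bash
# pick_latest_slave: read "host binlog_file position" lines on stdin and print
# the host that has received the most data from the master.
# Hypothetical helper for illustration only.
pick_latest_slave() {
    # Turn "binlog.000011" into the number 11, then sort by (file number, position)
    # and keep the last, i.e. the most advanced, entry.
    awk '{ n = $2; sub(/^.*\./, "", n); print n + 0, $3 + 0, $1 }' \
        | sort -k1,1n -k2,2n \
        | tail -n 1 \
        | awk '{ print $3 }'
}
```

For example, feeding it one line per slave (`printf '%s\n' '192.168.60.208 binlog.000011 1560' '192.168.60.209 binlog.000011 1200' | pick_latest_slave`) prints the host with the higher position in the newest file.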
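MHA is shipped as two packages, the Manager toolkit and the Node toolkit, described below.
The Manager toolkit mainly includes the following tools:
masterha_check_ssh checks the SSH configuration used by MHA
masterha_check_repl checks the MySQL replication status
masterha_manager starts MHA
masterha_check_status checks the current running state of MHA
masterha_master_monitor detects whether the master is down
masterha_master_switch controls failover (automatic or manual)
masterha_conf_host adds or removes server entries in the configuration
The Node toolkit (these tools are normally triggered by MHA Manager scripts and need no manual operation) mainly includes the following tools:
save_binary_logs saves and copies the master's binary logs
apply_diff_relay_logs identifies differential relay log events and applies them to the other slaves
filter_mysqlbinlog removes unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs purges relay logs (without blocking the SQL thread)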
I. Project plan
Host                        IP
Master (master)             192.168.60.206
Slave (slave1)              192.168.60.208
Slave (slave2)              192.168.60.209
Virtual IP (VRRP floating)  192.168.60.220
Server details:
1. Three servers in total, running CentOS Linux release 7.8
2. IP addresses: 192.168.60.206, 192.168.60.208, 192.168.60.209
3. 206 is the master node; 208 and 209 are slave nodes
4. mha-manager is installed on 209; mha-node is installed on all three machines
5. All three run MySQL 5.7
II. Preparation
Configure hosts (all three servers must be able to resolve each other)
cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.60.206 master
192.168.60.208 slave1
192.168.60.209 slave2
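Since the same three mappings have to exist on every server, a quick check saves debugging later. The following is a minimal sketch; the function names hosts_has_entry and check_cluster_hosts are made up for this example, and the expected mappings are simply the three entries listed above.

```shell
#!/bin/bash
# hosts_has_entry FILE IP NAME: succeed if FILE maps IP to NAME.
hosts_has_entry() {
    awk -v ip="$2" -v name="$3" '
        $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
        END { exit !found }
    ' "$1"
}

# check_cluster_hosts FILE: print every missing mapping and
# return non-zero if at least one is missing.
check_cluster_hosts() {
    local file=$1 missing=0 ip name
    while read -r ip name; do
        if ! hosts_has_entry "$file" "$ip" "$name"; then
            echo "missing: $ip $name"
            missing=1
        fi
    done <<'EOF'
192.168.60.206 master
192.168.60.208 slave1
192.168.60.209 slave2
EOF
    return "$missing"
}
```

Running `check_cluster_hosts /etc/hosts` on each server prints nothing when all three entries are present.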
Disable the firewall and SELinux (run the same commands on all three servers)
systemctl stop firewalld
systemctl disable firewalld
setenforce 0
sed -i '/^SELINUX=/s#enforcing#disabled#g' /etc/selinux/config
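Configure passwordless SSH login (run the same steps on all three servers)
vim ssh.sh
#!/bin/bash
yum -y install sshpass &> /dev/null
read -p "Enter the server password: " passwd
UserName=root
IP="192.168.60."
# Create the key pair
# (DSA keys are rejected by default in newer OpenSSH releases; use -t rsa or -t ed25519 there)
ssh-keygen -t dsa -f ~/.ssh/id_dsa -P "" &>/dev/null
# Distribute the public key
for i in 206 208 209 # change these to the last octets of your own hosts
do
    sshpass -p "$passwd" ssh-copy-id -i ~/.ssh/id_dsa.pub -p 22 -o StrictHostKeyChecking=no $UserName@$IP$i &>/dev/null
done
Verify passwordless SSH from each node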
master
[root@master ~]# ssh 192.168.60.206 date
Thu Aug 27 15:27:28 CST 2020
[root@master ~]# ssh 192.168.60.208 date
Thu Aug 27 15:27:56 CST 2020
[root@master ~]# ssh 192.168.60.209 date
Thu Aug 27 15:28:12 CST 2020
slave1
[root@slave1 ~]# ssh 192.168.60.206 date
Thu Aug 27 15:29:08 CST 2020
[root@slave1 ~]# ssh 192.168.60.208 date
Thu Aug 27 15:29:14 CST 2020
[root@slave1 ~]# ssh 192.168.60.209 date
Thu Aug 27 15:29:18 CST 2020
slave2
[root@slave2 ~]# ssh 192.168.60.206 date
Thu Aug 27 15:30:29 CST 2020
[root@slave2 ~]# ssh 192.168.60.208 date
Thu Aug 27 15:30:32 CST 2020
[root@slave2 ~]# ssh 192.168.60.209 date
Thu Aug 27 15:30:35 CST 2020
III. MySQL environment setup
Install MySQL 5.7 on all three servers
Download and install the official MySQL Yum repository
wget -c http://dev.mysql.com/get/mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql57-community-release-el7-10.noarch.rpm
yum -y install mysql-community-server
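MySQL database setup
First start MySQL:
systemctl start mysqld.service
systemctl status mysqld.service
systemctl enable mysqld.service
MySQL is now running, but to log in you first need the temporary password of the root user, which can be found in the log file:
grep "password" /var/log/mysqld.log
Log in to the database:
mysql -uroot -p # you will be prompted for the password
After you enter the initial password nothing else can be done yet, because MySQL requires the password to be changed before any other operation:
mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
One catch: if the new password is too simple, this statement fails with an error.
The reason is MySQL's password validation rules, which depend on the value of validate_password_policy.
To relax the rules, run the following SQL:
mysql> set global validate_password_policy=0;
Query OK, 0 rows affected (0.00 sec)
mysql> set global validate_password_length=1;
Query OK, 0 rows affected (0.00 sec)
mysql> ALTER USER 'root'@'localhost' IDENTIFIED BY '123456';
Query OK, 0 rows affected (0.00 sec)
With these values in place the password can be very simple, for example 1234. The database password setup is now complete.
One more issue remains: because the Yum repository package is installed, every later yum operation will try to update MySQL automatically, so remove the package:
yum -y remove mysql57-community-release-el7-10.noarch
The installation is now complete.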
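IV. Configure MySQL master-slave replication
1. Create a database
mysql> create database mydb charset utf8;
2. Back up the data
mysqldump -uroot -p mydb >mydb.sql
3. Copy the dump to the other two servers
scp mydb.sql 192.168.60.208:/root
scp mydb.sql 192.168.60.209:/root
4. Import the data (run on both 208 and 209; the mydb database must already exist there, because a single-database dump contains no CREATE DATABASE statement)
mysql -uroot -p mydb < mydb.sql
5. Edit my.cnf and restart the mysql service (edit it on all three servers; only server-id has to differ between them)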
[root@master ~]# cat /etc/my.cnf
[mysqld]
datadir=/var/lib/mysql
socket=/var/lib/mysql/mysql.sock
symbolic-links=0
validate_password_policy=0
validate_password_length=6
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
log_bin=binlog
expire_logs_days=7
max_binlog_size=200M
relay_log_purge = 0
log_slave_updates = 1
server-id=1
binlog-do-db=mydb
binlog-ignore-db=mysql
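The only setting that must differ between the three my.cnf files is server-id. One common convention, shown here as an assumption rather than what this setup used (the master above has server-id=1), is to derive the value from the last octet of the host's IP address. Both function names are hypothetical, and set_server_id relies on the in-place option of GNU sed.

```shell
#!/bin/bash
# server_id_from_ip IP: print the last octet of an IPv4 address,
# e.g. 192.168.60.206 gives 206.
server_id_from_ip() {
    local ip=$1
    echo "${ip##*.}"
}

# set_server_id FILE ID: rewrite the existing server-id line of a
# my.cnf-style file in place (GNU sed). Other lines are left untouched.
set_server_id() {
    sed -i "s/^server-id[[:space:]]*=.*/server-id=$2/" "$1"
}
```

On slave1 this could be used as `set_server_id /etc/my.cnf "$(server_id_from_ip 192.168.60.208)"`, followed by a restart of mysqld.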
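6. Grant privileges on all three MySQL nodes (replication and MHA accounts)
mysql> grant replication slave on *.* to 'repl'@'192.168.60.%' identified by '123456'; # replication user
mysql> grant all on *.* to 'mha'@'192.168.60.%' identified by '123456'; # MHA management user, important
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
7. On both slave nodes, enable read-only mode (this guards against accidental writes but does not apply to users with the SUPER privilege; important)
mysql> set global read_only=1;
8. Check the status on the master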
mysql> show master status;
+---------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------+----------+--------------+------------------+-------------------+
| binlog.000011 | 154 | mydb | mysql | |
+---------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
9. Run the following on both slave nodes (208 and 209)
mysql> change master to master_host='192.168.60.206', master_user='repl', master_password='123456',master_log_file='binlog.000011', master_log_pos=154;
mysql> start slave;
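10. Check the replication status with show slave status\G; Slave_IO_Running and Slave_SQL_Running both showing Yes means replication is healthy (check on both 208 and 209)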
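The check in step 10 can be scripted so that it is easy to repeat on both slaves. This is a minimal sketch that only inspects the two thread fields named above; the function name replication_ok is hypothetical.

```shell
#!/bin/bash
# replication_ok: read `SHOW SLAVE STATUS\G` output on stdin and succeed only
# if both Slave_IO_Running and Slave_SQL_Running report "Yes".
replication_ok() {
    awk -F': *' '
        { key = $1; gsub(/^[ \t]+/, "", key) }   # strip the left padding
        key == "Slave_IO_Running"  { io  = $2 }
        key == "Slave_SQL_Running" { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}
```

A possible use on a slave is `mysql -uroot -p -e 'show slave status\G' | replication_ok && echo "replication OK"`.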
11. Verify replication by writing data on the master
mysql> use mydb;
Database changed
mysql> create table test(id int primary key);
Query OK, 0 rows affected (0.03 sec)
mysql> insert into test values(1);
Query OK, 1 row affected (0.02 sec)
mysql> insert into test values(2);
Query OK, 1 row affected (0.00 sec)
mysql> insert into test values(3);
Query OK, 1 row affected (0.01 sec)
mysql> insert into test values(4);
Query OK, 1 row affected (0.00 sec)
mysql> insert into test values(5);
Query OK, 1 row affected (0.00 sec)
mysql> select * from mydb.test;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
5 rows in set (0.00 sec)
Verify on slave1
mysql> select * from mydb.test;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
5 rows in set (0.00 sec)
Verify on slave2
mysql> select * from mydb.test;
+----+
| id |
+----+
| 1 |
| 2 |
| 3 |
| 4 |
| 5 |
+----+
5 rows in set (0.01 sec)
V. Deploy the MHA cluster
1. Install the MHA software (install the MHA Node package on all three nodes)
Install the dependencies first
wget http://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
rpm -ivh epel-release-latest-7.noarch.rpm
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager -y
Download the package (either of the two sources works)
wget https://qiniu.wsfnk.com/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58-0.el7.centos.noarch.rpm
rpm -ivh mha4mysql-node-0.58-0.el7.centos.noarch.rpm
2. Install MHA Manager on slave2 (note that the manager node lives on slave2)
wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
rpm -ivh mha4mysql-manager-0.58-0.el7.centos.noarch.rpm
3. Create the directories and prepare the configuration file
mkdir -p /etc/mha
mkdir -p /var/log/mha/app1
mkdir -p /etc/mha/scripts/
vim /etc/mha/app1.cnf
[server default]
manager_log=/var/log/mha/app1/manager.log
manager_workdir=/var/log/mha/app1
master_binlog_dir=/var/lib/mysql
master_ip_failover_script=/etc/mha/scripts/master_ip_failover
password=123456
ping_interval=2
repl_password=123456
repl_user=repl
ssh_user=root
user=mha
[server1]
hostname=192.168.60.206
port=3306
[server2]
hostname=192.168.60.208
port=3306
[server3]
hostname=192.168.60.209
port=3306
ignore_fail=1 # without this, MHA becomes unusable when this node is down; with it, MHA keeps working even if this slave has failed
no_master=1 # never promote this host to master
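Before running the checks in the next section it can help to list which hosts the manager will actually see. The helper below is a sketch, not part of MHA; mha_hosts is a hypothetical name, and the parsing assumes the simple section and key=value layout used in the file above.

```shell
#!/bin/bash
# mha_hosts FILE: print "section hostname" for every [serverN] block of an
# MHA application config. The [server default] block is skipped.
mha_hosts() {
    awk '
        /^\[/ {
            section = $0
            sub(/^\[/, "", section)      # drop the opening bracket
            sub(/\].*$/, "", section)    # drop the closing bracket and anything after it
            next
        }
        section ~ /^server[0-9]+$/ && /^hostname[ \t]*=/ {
            value = $0
            sub(/^[^=]*=[ \t]*/, "", value)   # keep the text after "="
            sub(/[ \t]*#.*$/, "", value)      # drop a trailing comment, if any
            print section, value
        }
    ' "$1"
}
```

With the configuration above, `mha_hosts /etc/mha/app1.cnf` should list server1, server2 and server3 with their IP addresses.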
VI. Verify the MHA status
Verify the SSH status
[root@slave2 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
Thu Aug 27 17:23:46 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug 27 17:23:46 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Aug 27 17:23:46 2020 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Aug 27 17:23:46 2020 - [info] Starting SSH connection tests..
Thu Aug 27 17:23:47 2020 - [debug]
Thu Aug 27 17:23:46 2020 - [debug] Connecting via SSH from root@192.168.60.206(192.168.60.206:22) to root@192.168.60.208(192.168.60.208:22)..
Thu Aug 27 17:23:46 2020 - [debug] ok.
Thu Aug 27 17:23:46 2020 - [debug] Connecting via SSH from root@192.168.60.206(192.168.60.206:22) to root@192.168.60.209(192.168.60.209:22)..
Thu Aug 27 17:23:47 2020 - [debug] ok.
Thu Aug 27 17:23:48 2020 - [debug]
Thu Aug 27 17:23:47 2020 - [debug] Connecting via SSH from root@192.168.60.208(192.168.60.208:22) to root@192.168.60.206(192.168.60.206:22)..
Thu Aug 27 17:23:47 2020 - [debug] ok.
Thu Aug 27 17:23:47 2020 - [debug] Connecting via SSH from root@192.168.60.208(192.168.60.208:22) to root@192.168.60.209(192.168.60.209:22)..
Thu Aug 27 17:23:47 2020 - [debug] ok.
Thu Aug 27 17:23:49 2020 - [debug]
Thu Aug 27 17:23:47 2020 - [debug] Connecting via SSH from root@192.168.60.209(192.168.60.209:22) to root@192.168.60.206(192.168.60.206:22)..
Thu Aug 27 17:23:47 2020 - [debug] ok.
Thu Aug 27 17:23:47 2020 - [debug] Connecting via SSH from root@192.168.60.209(192.168.60.209:22) to root@192.168.60.208(192.168.60.208:22)..
Thu Aug 27 17:23:48 2020 - [debug] ok.
Thu Aug 27 17:23:49 2020 - [info] All SSH connection tests passed successfully.
Note: "All SSH connection tests passed successfully" means passwordless SSH works between all nodes
Verify the database cluster status
[root@slave2 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
Thu Aug 27 17:30:48 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Aug 27 17:30:48 2020 - [info] Reading application default configuration from /etc/mha/app1.cnf..
Thu Aug 27 17:30:48 2020 - [info] Reading server configuration from /etc/mha/app1.cnf..
Thu Aug 27 17:30:48 2020 - [info] MHA::MasterMonitor version 0.58.
Thu Aug 27 17:30:50 2020 - [info] GTID failover mode = 0
Thu Aug 27 17:30:50 2020 - [info] Dead Servers:
Thu Aug 27 17:30:50 2020 - [info] Alive Servers:
Thu Aug 27 17:30:50 2020 - [info] 192.168.60.206(192.168.60.206:3306)
Thu Aug 27 17:30:50 2020 - [info] 192.168.60.208(192.168.60.208:3306)
Thu Aug 27 17:30:50 2020 - [info] 192.168.60.209(192.168.60.209:3306)
Thu Aug 27 17:30:50 2020 - [info] Alive Slaves:
Thu Aug 27 17:30:50 2020 - [info] 192.168.60.208(192.168.60.208:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 27 17:30:50 2020 - [info] Replicating from 192.168.60.206(192.168.60.206:3306)
Thu Aug 27 17:30:50 2020 - [info] 192.168.60.209(192.168.60.209:3306) Version=5.7.31-log (oldest major version between slaves) log-bin:enabled
Thu Aug 27 17:30:50 2020 - [info] Replicating from 192.168.60.206(192.168.60.206:3306)
Thu Aug 27 17:30:50 2020 - [info] Not candidate for the new Master (no_master is set)
Thu Aug 27 17:30:50 2020 - [info] Current Alive Master: 192.168.60.206(192.168.60.206:3306)
Thu Aug 27 17:30:50 2020 - [info] Checking slave configurations..
Thu Aug 27 17:30:50 2020 - [info] read_only=1 is not set on slave 192.168.60.208(192.168.60.208:3306).
Thu Aug 27 17:30:50 2020 - [info] read_only=1 is not set on slave 192.168.60.209(192.168.60.209:3306).
Thu Aug 27 17:30:50 2020 - [info] Checking replication filtering settings..
Thu Aug 27 17:30:50 2020 - [info] binlog_do_db= mydb, binlog_ignore_db= mysql
Thu Aug 27 17:30:50 2020 - [info] Replication filtering check ok.
Thu Aug 27 17:30:50 2020 - [info] GTID (with auto-pos) is not supported
Thu Aug 27 17:30:50 2020 - [info] Starting SSH connection tests..
Thu Aug 27 17:30:52 2020 - [info] All SSH connection tests passed successfully.
Thu Aug 27 17:30:52 2020 - [info] Checking MHA Node version..
Thu Aug 27 17:30:53 2020 - [info] Version check ok.
Thu Aug 27 17:30:53 2020 - [info] Checking SSH publickey authentication settings on the current master..
Thu Aug 27 17:30:53 2020 - [info] HealthCheck: SSH to 192.168.60.206 is reachable.
Thu Aug 27 17:30:53 2020 - [info] Master MHA Node version is 0.58.
Thu Aug 27 17:30:53 2020 - [info] Checking recovery script configurations on 192.168.60.206(192.168.60.206:3306)..
Thu Aug 27 17:30:53 2020 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/var/tmp/save_binary_logs_test --manager_version=0.58 --start_file=binlog.000011
Thu Aug 27 17:30:53 2020 - [info] Connecting to root@192.168.60.206(192.168.60.206:22)..
Creating /var/tmp if not exists.. ok.
Checking output directory is accessible or not..
ok.
Binlog found at /var/lib/mysql, up to binlog.000011
Thu Aug 27 17:30:53 2020 - [info] Binlog setting check done.
Thu Aug 27 17:30:53 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Aug 27 17:30:53 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.60.208 --slave_ip=192.168.60.208 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.31-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Thu Aug 27 17:30:53 2020 - [info] Connecting to root@192.168.60.208(192.168.60.208:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to slave1-relay-bin.000002
Temporary relay log file is /var/lib/mysql/slave1-relay-bin.000002
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Aug 27 17:30:54 2020 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.60.209 --slave_ip=192.168.60.209 --slave_port=3306 --workdir=/var/tmp --target_version=5.7.31-log --manager_version=0.58 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Thu Aug 27 17:30:54 2020 - [info] Connecting to root@192.168.60.209(192.168.60.209:22)..
Checking slave recovery environment settings..
Opening /var/lib/mysql/relay-log.info ... ok.
Relay log found at /var/lib/mysql, up to slave2-relay-bin.000002
Temporary relay log file is /var/lib/mysql/slave2-relay-bin.000002
Checking if super_read_only is defined and turned on.. not present or turned off, ignoring.
Testing mysql connection and privileges..
mysql: [Warning] Using a password on the command line interface can be insecure.
done.
Testing mysqlbinlog output.. done.
Cleaning up test file(s).. done.
Thu Aug 27 17:30:54 2020 - [info] Slaves settings check done.
Thu Aug 27 17:30:54 2020 - [info]
192.168.60.206(192.168.60.206:3306) (current master)
+--192.168.60.208(192.168.60.208:3306)
+--192.168.60.209(192.168.60.209:3306)
Thu Aug 27 17:30:54 2020 - [info] Checking replication health on 192.168.60.208..
Thu Aug 27 17:30:54 2020 - [info] ok.
Thu Aug 27 17:30:54 2020 - [info] Checking replication health on 192.168.60.209..
Thu Aug 27 17:30:54 2020 - [info] ok.
Thu Aug 27 17:30:54 2020 - [info] Checking master_ip_failover_script status:
Thu Aug 27 17:30:54 2020 - [info] /etc/mha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.60.206 --orig_master_ip=192.168.60.206 --orig_master_port=3306
Unknown option: orig_master_ip
Unknown option: orig_master_port
IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.60.220/22===
Checking the Status of the script.. OK
Thu Aug 27 17:30:54 2020 - [info] OK.
Thu Aug 27 17:30:54 2020 - [warning] shutdown_script is not defined.
Thu Aug 27 17:30:54 2020 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
Note: "MySQL Replication Health is OK" means the cluster is in a healthy state
VII. Start MHA
[root@slave2 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
[1] 17359
[root@slave2 ~]# ps -ef |grep mha
root 3186 1466 0 10:15 pts/0 00:00:30 perl /usr/bin/masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover
root 17492 2402 0 17:34 pts/1 00:00:00 grep --color=auto mha
Check the status
[root@slave2 ~]# masterha_check_status --conf=/etc/mha/app1.cnf
app1 (pid:3186) is running(0:PING_OK), master:192.168.60.206
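For monitoring, the status line can be reduced to the current master address. The sketch below parses only the line format shown above; any other output is treated as "not healthy". The function name mha_master_from_status is hypothetical.

```shell
#!/bin/bash
# mha_master_from_status: read one masterha_check_status line on stdin and
# print the master host; fail if the line does not report the PING_OK state.
mha_master_from_status() {
    local line
    IFS= read -r line || true
    case $line in
        *"(0:PING_OK)"*master:*) echo "${line##*master:}" ;;
        *) return 1 ;;
    esac
}
```

It could be combined with the real command as `masterha_check_status --conf=/etc/mha/app1.cnf | mha_master_from_status`.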
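VIII. Configure the virtual IP (VIP)
To prevent split-brain, managing the virtual IP with a script is recommended for production environments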
vim /etc/mha/scripts/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.60.220/24'; # pick a VIP in the same subnet as your servers
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip"; # eth0 must match your real NIC name (for example ens33); change it if yours differs
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
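# MHA also passes orig_master_ip, orig_master_port and new_master_host.
# Without them Getopt prints "Unknown option" and $new_master_host stays
# undefined, so the start command cannot reach the new master.
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);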
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
Add the virtual IP on the slave2 node
1.ifconfig eth0:1 192.168.60.220/24
2.[root@slave2 ~]# ifconfig
docker0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 172.17.0.1 netmask 255.255.0.0 broadcast 172.17.255.255
ether 02:42:4d:6c:22:65 txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 192.168.60.209 netmask 255.255.252.0 broadcast 192.168.63.255
inet6 fe80::98ac:404b:b616:a248 prefixlen 64 scopeid 0x20<link>
inet6 fe80::b41a:7e5c:3252:28ce prefixlen 64 scopeid 0x20<link>
inet6 fe80::6538:4b94:1249:2af8 prefixlen 64 scopeid 0x20<link>
ether 00:0c:29:4b:16:eb txqueuelen 1000 (Ethernet)
RX packets 115587 bytes 10828651 (10.3 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 36832 bytes 4253686 (4.0 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
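To confirm which node currently holds the VIP, the address list can be checked in a script. This sketch reads the one-line-per-address output of `ip -o -4 addr show`, where the fourth field is the address with its prefix length. The function name vip_bound is hypothetical, and using ip instead of ifconfig is a choice made for this example.

```shell
#!/bin/bash
# vip_bound VIP: read `ip -o -4 addr show` output on stdin and succeed if
# VIP is assigned to any interface on this host.
vip_bound() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }   # $4 looks like 192.168.60.220/24
        END { exit !found }
    '
}
```

For example, `ip -o -4 addr show | vip_bound 192.168.60.220 && echo "VIP is on this host"` can be run on each node in turn.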
IX. Test MHA failover
1. Stop the master database
[root@master ~]# systemctl stop mysqld
[root@master ~]# ss -ntlp |grep 3306
[root@master ~]# ps -ef |grep mysql
root 31366 6836 0 17:50 pts/0 00:00:00 grep --color=auto mysql
2. Log in to slave2 and check the replication status
On slave2 the slave status shows that Master_Host has changed to slave1 (208)
[root@slave2 ~]# mysql -uroot -p123456 -e 'show slave status\G';
mysql: [Warning] Using a password on the command line interface can be insecure.
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.60.208
Master_User: repl
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: binlog.000011
Read_Master_Log_Pos: 1560
Relay_Log_File: slave2-relay-bin.000002
Relay_Log_Pos: 317
Relay_Master_Log_File: binlog.000011
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1560
Relay_Log_Space: 525
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: 8a32d1ef-e6a3-11ea-8fcd-000c2949fb11
Master_Info_File: /var/lib/mysql/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
3. Check the configuration file on slave2
[root@slave2 ~]# vim /etc/mha/app1.cnf
[server default]
manager_log=/var/log/mha/app1/manager.log
manager_workdir=/var/log/mha/app1
master_binlog_dir=/var/lib/mysql
master_ip_failover_script=/etc/mha/scripts/master_ip_failover
password=123456
ping_interval=2
repl_password=123456
repl_user=repl
ssh_user=root
user=mha
[server2]
hostname=192.168.60.208
port=3306
[server3]
hostname=192.168.60.209
ignore_fail=1
no_master=1
port=3306
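After MySQL on the master went down, MHA detected the failure, promoted the slave with the latest data (slave1) to be the new master, and changed master_host on the remaining slave to point to it, which is the high-availability behavior this setup is meant to provide. Because the manager was started with --remove_dead_master_conf, the [server1] block of the failed master has also been removed from the configuration file.
4. Check the MHA log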
[root@slave2 ~]# tail /var/log/mha/app1/manager.log
Invalidated master IP address on 192.168.60.206(192.168.60.206:3306)
The latest slave 192.168.60.208(192.168.60.208:3306) has all relay logs for recovery.
Selected 192.168.60.208(192.168.60.208:3306) as a new master.
192.168.60.208(192.168.60.208:3306): OK: Applying all logs succeeded.
Failed to activate master IP address for 192.168.60.208(192.168.60.208:3306) with return code 10:0
192.168.60.209(192.168.60.209:3306): This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
192.168.60.209(192.168.60.209:3306): OK: Applying all logs succeeded. Slave started, replicating from 192.168.60.208(192.168.60.208:3306)
192.168.60.208(192.168.60.208:3306): Resetting slave info succeeded.
Master failover to 192.168.60.208(192.168.60.208:3306) completed successfully.
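The log shows that 208 was successfully promoted to master. The line "Failed to activate master IP address ... with return code 10:0" also shows that the VIP script did not bring up the VIP on the new master, so that step needs to be checked separately.
At this point the failed node can be repaired by hand and the cluster restored to its previous master-slave layout,
then MHA restarted: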
nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/mha/app1/manager.log 2>&1 &
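When the old master is brought back as a slave, it needs the replication coordinates of the new master. MHA manager logs usually contain a line of the form "All other slaves should start replication from here. Statement should be: CHANGE MASTER TO ...;" with the password masked. Treat that exact wording as an assumption and check your own log. The sketch below pulls the last such statement out of a log file; the function name change_master_from_log is hypothetical.

```shell
#!/bin/bash
# change_master_from_log FILE: print the last "CHANGE MASTER TO ...;"
# statement found in an MHA manager log. The masked password placeholder
# is printed as-is and must be replaced by hand before running the statement.
change_master_from_log() {
    grep -o 'CHANGE MASTER TO .*;' "$1" | tail -n 1
}
```

Run it as `change_master_from_log /var/log/mha/app1/manager.log` before restarting the manager, because the restart command above redirects output with `>` and so truncates the old log.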