MySQL MHA
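Architecture: MHA consists of two parts, MHA Manager (the management node) and MHA Node (the data node). MHA Node runs on every MySQL server. MHA Manager periodically probes the master of the cluster; when the master fails, it automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master.
MHA's weakness: during automatic failover, MHA tries to save the binary logs from the crashed master so that as little data as possible is lost. The problem is that if the master has died from a hardware failure or cannot be reached over SSH, MHA cannot save the binary logs; it can only fail over, and the most recent data may be lost.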
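The working principle can be summarized as follows:
1. Save the binary log events (binlog events) from the crashed master;
2. Identify the slave that has the most recent updates;
3. Apply the differential relay log to the other slaves;
4. Apply the binary log events saved from the master;
5. Promote one slave to be the new master;
6. Point the other slaves at the new master and resume replication.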
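Step 2 above (identifying the slave with the most recent updates) comes down to comparing how far each slave has read into the master's binlog. The following is a minimal sketch of that comparison, not MHA's actual code. It assumes each slave's Master_Log_File and Read_Master_Log_Pos have already been collected as lines of the form "host file pos"; that input format and the function name latest_slave are hypothetical.

```shell
#!/bin/sh
# latest_slave: read "host binlog_file position" lines on stdin and print
# the host that has read furthest into the master's binlog.
# Assumption: binlog files share one prefix and a zero-padded numeric suffix
# (e.g. mysql-bin.000002), so string order on the file name matches age.
latest_slave() {
    awk '
        NR == 1 || $2 > best_file || ($2 == best_file && $3 + 0 > best_pos) {
            best_host = $1; best_file = $2; best_pos = $3 + 0
        }
        END { if (NR > 0) print best_host }
    '
}

# Example with made-up positions for the two slaves:
printf '%s\n' \
    'slave1 mysql-bin.000002 1229' \
    'slave2 mysql-bin.000002 4518' | latest_slave
```

The file name is compared first and the position only breaks ties, because a position is meaningful only within one binlog file.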
1. Install MySQL:
1.1 Add the environment variable
vim /etc/profile
export PATH=$PATH:/usr/local/mysql/bin
source /etc/profile
1.2 Unpack the tarball (run from /usr/local, since the following commands expect /usr/local/mysql)
tar -xf mysql-5.7.22-linux-glibc2.12-x86_64.tar.gz
mv mysql-5.7.22-linux-glibc2.12-x86_64 mysql
scp -r /usr/local/mysql slave1:/usr/local/mysql
scp -r /usr/local/mysql slave2:/usr/local/mysql
scp /etc/my.cnf slave1:/etc/
scp /etc/my.cnf slave2:/etc/
The server-id in my.cnf must be unique on every node
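Because the same my.cnf is copied to both slaves, it is easy to forget to change server-id on one of them. A minimal sketch of a uniqueness check follows. It assumes the ids have been gathered as "host server_id" lines; the input format and the function name duplicate_server_ids are hypothetical.

```shell
#!/bin/sh
# duplicate_server_ids: read "host server_id" lines on stdin and print every
# server_id used by more than one host. Prints nothing when all ids are unique.
duplicate_server_ids() {
    awk '{ print $2 }' | sort | uniq -d
}

# Example with hypothetical values: slave2 still carries the copied id 1.
printf '%s\n' 'master 1' 'slave1 2' 'slave2 1' | duplicate_server_ids
```

On a running server the id can be read with mysql -N -e "select @@server_id", which is one way to build the input lines.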
1.3 Create the user and directories, set ownership, initialize, and start (run on all 3 machines)
useradd mysql
mkdir -p /home/mysql3306/{mysql3306,logs}
chown mysql:mysql /home/mysql3306 -R
chown mysql:mysql /usr/local/mysql -R
mysqld --defaults-file=/etc/my.cnf --initialize-insecure --datadir=/home/mysql3306/mysql3306 --basedir=/usr/local/mysql --user=mysql
mysqld_safe --user=mysql &
2. Configure master-slave replication
2.1 On the master, create the replication account and grant it to the slaves:
grant REPLICATION CLIENT,REPLICATION SLAVE on *.* to rep@'192.168.111.129' identified by '123456';
grant REPLICATION CLIENT,REPLICATION SLAVE on *.* to rep@'192.168.111.130' identified by '123456';
flush privileges;
2.2 Check the master status to get the binlog file and position
mysql> show master status;
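The file name and position reported here are what the slaves need in their change master statement. As a sketch of how the two steps connect, the helper below turns one line of "file position" output into that statement. The function name change_master_sql is hypothetical, and the example values are the ones used in this walkthrough.

```shell
#!/bin/sh
# change_master_sql: read one "binlog_file position ..." line on stdin and
# print the matching CHANGE MASTER statement.
# Arguments: $1 = master host, $2 = replication user, $3 = password, $4 = port
change_master_sql() {
    read -r file pos _rest
    # Refuse to print a statement with missing coordinates.
    [ -n "$file" ] && [ -n "$pos" ] || return 1
    printf "change master to master_host='%s', master_user='%s', master_password='%s', master_port=%s, master_log_file='%s', master_log_pos=%s;\n" \
        "$1" "$2" "$3" "$4" "$file" "$pos"
}

# Example using the coordinates from this walkthrough:
printf 'mysql-bin.000002\t1229\n' | change_master_sql 192.168.111.128 rep 123456 3306
```

On the master, mysql -N -e 'show master status' prints the file and position as the first two tab-separated columns, which is the input shape this helper expects.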
2.3 On slave1 and slave2, set the master to replicate from
change master to master_host='192.168.111.128',master_user='rep',master_password='123456', master_log_file='mysql-bin.000002',master_log_pos=1229,MASTER_PORT=3306;
flush privileges;
start slave;
2.4 Check the replication status on the slaves
show slave status\G
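The two fields that matter in this output are Slave_IO_Running and Slave_SQL_Running; both should say Yes. A minimal sketch of a scripted check is shown below. The function name replication_ok is hypothetical, and the sample input is a trimmed, made-up excerpt rather than real output from this cluster.

```shell
#!/bin/sh
# replication_ok: read the output of "show slave status\G" on stdin and
# succeed only if both replication threads report "Yes".
replication_ok() {
    awk -F': ' '
        $1 ~ /Slave_IO_Running$/  { io  = $2 }
        $1 ~ /Slave_SQL_Running$/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Example on a trimmed, made-up excerpt of the status output:
replication_ok <<'EOF' && echo "replication threads are running"
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
EOF
```

On a slave this could be fed with mysql -uroot -e 'show slave status\G' | replication_ok. The patterns are anchored at the end of the field name so that Slave_SQL_Running_State is not mistaken for Slave_SQL_Running.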
2.5 Set read_only on both slave servers (the slaves serve reads; this is not written to the configuration file because a slave may be promoted to master at any time)
mysql -uroot -e "set global read_only=1"
2.6 Create the monitoring user required by the manager on all nodes
grant all privileges on *.* to 'rep'@'192.168.111.%' identified by '123456';
3. Set up MHA
3.1 Configure time synchronization and password-less SSH login within the cluster
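No commands are given for this step, so here is a minimal sketch of the SSH part only. It assumes the root account is used and that the host names master, slave1 and slave2 resolve on every node; both are assumptions, as is the function name print_ssh_setup. Time synchronization is not covered. The helper only prints the commands so they can be reviewed before being run on each node.

```shell
#!/bin/sh
# print_ssh_setup: print the commands that give the current node
# password-less SSH access to every host passed as an argument.
# It only prints; review the output, then run it on each node in turn.
print_ssh_setup() {
    echo "ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    for host in "$@"; do
        echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$host"
    done
}

# Assumed host names; replace with the names or IPs of your own nodes.
print_ssh_setup master slave1 slave2
```

Every node needs access to every other node, which is why the printed commands have to be run on the manager and on all three MySQL hosts.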
3.2 Install MHA Node on the node hosts
yum install -y perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-DBD-MySQL perl-devel perl-CPAN
mkdir -p /etc/mha ## Create the installation directory
tar -xf mha4mysql-node-0.57.tar.gz -C /etc/mha/
mv /etc/mha/mha4mysql-node-0.57 /etc/mha/node
cd /etc/mha/node
perl Makefile.PL
make && make install
3.3 Install MHA Manager on the manager node
yum install -y perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
tar -xf mha4mysql-manager-0.57.tar.gz -C /etc/mha/
mv /etc/mha/mha4mysql-manager-0.57 /etc/mha/manager
cd /etc/mha/manager
perl Makefile.PL
make && make install
3.4 Configure MHA
Edit the manager configuration file
mkdir /etc/mha/app1 ## Create the manager working directory
cp /etc/mha/manager/samples/conf/app1.cnf /etc/mha/
vim /etc/mha/app1.cnf
[server default]
manager_workdir=/etc/mha/app1 #MHA working directory
manager_log=/etc/mha/app1/manager.log #MHA log path
master_binlog_dir="/home/mysql3306/mysql3306" #binlog path on the MHA node side, i.e. the MySQL data directory
remote_workdir=/etc/mha/app1 #where binlogs are saved on the remote MySQL hosts during a switchover
master_ip_failover_script=/etc/mha/master_ip_failover #script run on automatic failover
master_ip_online_change_script=/etc/mha/master_ip_online_change #script for manual switchover
report_script=/etc/mha/send_report #alert script run after a switchover
user=rep #monitoring user
password=123456 #monitoring user password
repl_user=rep #replication user
repl_password=123456 #replication user password
ping_interval=1 #interval between MHA manager checks (1 second)
secondary_check_script= masterha_secondary_check -s slave1 -s master --user=rep --master_host=master --master_ip=192.168.111.128 --master_port=3306 #when MHA detects a problem with the master, the Manager tries to log in to the master via slave1
[server1]
hostname=192.168.111.128
port=3306
ssh_port=22
[server2]
hostname=192.168.111.129
port=3306
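candidate_master=1 #candidate master: if the master fails, this host is promoted to master even if it is not the slave with the latest events in the cluster
ssh_port=22
check_repl_delay=0 #by default MHA will not pick a slave as the new master if it is more than 100 MB of relay log behind the master; with 0, MHA ignores replication delay when choosing the new master. This is very useful for a host with candidate_master=1, because that candidate is then certain to become the new master during the switch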
[server3]
hostname=192.168.111.130
port=3306
no_master=1
ssh_port=22
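The listing above annotates most settings with a trailing # comment. The source does not say whether MHA's configuration parser ignores such trailing comments, so a cautious approach is to keep the annotated file as documentation and feed MHA a stripped copy. The sketch below does that; the function name strip_inline_comments is hypothetical, and it assumes no value legitimately contains a # character (a password with # in it would be truncated).

```shell
#!/bin/sh
# strip_inline_comments: copy a config from stdin to stdout with everything
# from the first "#" to the end of each line removed, then drop empty lines.
# Assumption: no setting value contains a literal "#".
strip_inline_comments() {
    sed -e 's/[[:space:]]*#.*$//' -e '/^$/d'
}

# Example on two lines taken from the listing above:
printf '%s\n' 'user=rep #monitoring user' 'ping_interval=1 #check interval' \
    | strip_inline_comments
```

A possible use is strip_inline_comments < app1.annotated.cnf > /etc/mha/app1.cnf, where app1.annotated.cnf is a hypothetical name for the commented version.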
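3.5 Set how relay logs are purged on the slave nodes; create hard links
During an MHA switchover, recovery of the slaves depends on the relay logs. By default MySQL purges a relay log automatically once the slave has applied it, so set relay_log_purge to OFF and purge manually instead.
mysql -uroot -p123456 -e "set global relay_log_purge=0"
Periodically deleting relay logs can cause replication delay, so create hard links to the relay logs: on Linux, removing a large file through a hard link is fast.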
mkdir /home/mysql3306/logs1
ln /home/mysql3306/logs/mysql-relay* /home/mysql3306/logs1
3.6 Write a script that purges relay logs periodically and run it from cron (on slave1 and slave2)
vim /etc/mha/purge_relay_log.sh
#!/bin/bash
user=root
passwd=123456
port=3306
log_dir='/home/mysql3306/logs/'
work_dir='/home/mysql3306/logs1'
purge='/usr/local/bin/purge_relay_logs'
if [ ! -d $log_dir ]
then
mkdir $log_dir -p
fi
$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --host=localhost --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1
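Parameter notes:
--workdir: where the hard links to the relay logs are created
--disable_relay_log_purge: by default, if relay_log_purge=1 the script purges nothing and exits. With this option, when relay_log_purge=1 the script sets relay_log_purge to 0, purges the relay logs, and finally leaves the parameter set to OFF.
A few details to note here:
The purge_relay_logs script hard-codes the socket file location /var/lib/mysql/mysql.sock; a symlink takes care of that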
ln -s /tmp/mysql3306.sock /var/lib/mysql/mysql.sock
purge_relay_logs connects with --user=root --host=localhost; if that account lacks the required privileges, grant them
Once that is sorted out, test it first:
purge_relay_logs --user=root --host=localhost --port=3306 --password=123456 --disable_relay_log_purge --workdir=/home/mysql3306/logs/
This output means the test passed: 2018-07-04 05:22:21: All relay log purging operations succeeded.
Add the cron job
crontab -e
0 0 */3 * * sh /etc/mha/purge_relay_log.sh
3.7 Create the automatic failover script
vim /etc/mha/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
my $vip = '192.168.111.111/24';
my $key = '0';
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
sub stop_vip() {
return 0 unless ($ssh_user);
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
3.8 Create the manual switchover script
vim /etc/mha/master_ip_online_change
#!/usr/bin/env perl
use strict;
use warnings FATAL =>'all';
use Getopt::Long;
my $vip = '192.168.111.111/24'; # Virtual IP
my $key = "0";
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";
my $exit_code = 0;
my (
$command, $orig_master_is_new_slave, $orig_master_host,
$orig_master_ip, $orig_master_port, $orig_master_user,
$orig_master_password, $orig_master_ssh_user, $new_master_host,
$new_master_ip, $new_master_port, $new_master_user,
$new_master_password, $new_master_ssh_user,
);
GetOptions(
'command=s' => \$command,
'orig_master_is_new_slave' => \$orig_master_is_new_slave,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'orig_master_user=s' => \$orig_master_user,
'orig_master_password=s' => \$orig_master_password,
'orig_master_ssh_user=s' => \$orig_master_ssh_user,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
'new_master_user=s' => \$new_master_user,
'new_master_password=s' => \$new_master_password,
'new_master_ssh_user=s' => \$new_master_ssh_user,
);
exit &main();
sub main {
#print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
# $orig_master_host, $orig_master_ip, $orig_master_port are passed.
# If you manage master ip address at global catalog database,
# invalidate orig_master_ip here.
my $exit_code = 1;
eval {
print "\n\n\n***************************************************************\n";
print "Disabling the VIP - $vip on old master: $orig_master_host\n";
print "***************************************************************\n\n\n\n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
# all arguments are passed.
# If you manage master ip address at global catalog database,
# activate new_master_ip here.
# You can also grant write access (create user, set read_only=0, etc) here.
my $exit_code = 10;
eval {
print "\n\n\n***************************************************************\n";
print "Enabling the VIP - $vip on new master: $new_master_host \n";
print "***************************************************************\n\n\n\n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
exit 0;
}
else {
&usage();
exit 1;
}
}
# A simple system call that enable the VIP on the new master
sub start_vip() {
`ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}
sub usage {
print
"Usage: master_ip_online_change --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
3.9 Write the alert script that reports a switchover
vim /etc/mha/send_report
#!/usr/bin/perl
use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;
#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp='smtp.163.com';
my $mail_from='xxxxx@163.com';
my $mail_user='xxxxx@163.com';
my $mail_pass='xxxxx';
my $mail_to=['xxxxx@139.com'];
GetOptions(
'orig_master_host=s' => \$dead_master_host,
'new_master_host=s' => \$new_master_host,
'new_slave_hosts=s' => \$new_slave_hosts,
'subject=s' => \$subject,
'body=s' => \$body,
);
mailToContacts($smtp,$mail_from,$mail_user,$mail_pass,$mail_to,$subject,$body);
sub mailToContacts {
my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
open my $DEBUG, "> /tmp/monitormail.log"
or die "Can't open the debug file:$!\n";
my $sender = new Mail::Sender {
ctype => 'text/plain; charset=utf-8',
encoding => 'utf-8',
smtp => $smtp,
from => $mail_from,
auth => 'LOGIN',
TLS_allowed => '0',
authid => $user,
authpwd => $passwd,
to => $mail_to,
subject => $subject,
debug => $DEBUG
};
$sender->MailMsg(
{ msg => $msg,
debug => $DEBUG
}
) or print $Mail::Sender::Error;
return 1;
}
# Do whatever you want here
exit 0;
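Parts of the script that need to be changed:
my $smtp='smtp.163.com'; ## address of the SMTP provider, usually smtp.(qq|163|139).com
my $mail_from='xxxxx@163.com'; ## mailbox that sends the mail
my $mail_user='xxxxx@163.com'; ## same as above
my $mail_pass='xxxxx'; ## mailbox authorization code; when POP3/SMTP is enabled on the mailbox you are usually asked to set one
my $mail_to=['xxxxx@139.com']; ## mailbox that receives the mail; 139 is China Mobile's SMS mailbox, which is convenient because the message arrives directly as an SMS
Make the scripts executable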
chmod +x /etc/mha/master_ip_failover
chmod +x /etc/mha/master_ip_online_change
chmod +x /etc/mha/send_report
3.10 On the manager, check that SSH works
/etc/mha/manager/bin/masterha_check_ssh --conf=/etc/mha/app1.cnf
3.11 On the manager, check the replication status
Create symlinks on all nodes
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/bin/mysql
/etc/mha/manager/bin/masterha_check_repl --conf=/etc/mha/app1.cnf
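3.12 Add the VIP on the master (the failover scripts above reference eth0 while this command uses ens33; use the interface name that actually exists on your hosts in both places)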
ifconfig ens33:0 192.168.111.111
3.13 Start MHA on the manager node
nohup /etc/mha/manager/bin/masterha_manager --conf=/etc/mha/app1.cnf --ignore_last_failover >/tmp/mha_manager.log < /dev/null 2>&1 &
3.14 Check the MHA status
/etc/mha/manager/bin/masterha_check_status --conf=/etc/mha/app1.cnf
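For scripting, it can help to pull the current master out of the status line. The sketch below assumes the running-state line looks like "app1 (pid:5605) is running(0:PING_OK), master:192.168.111.128"; that sample line, including the pid, is illustrative, so check it against the output of your own MHA version. The function name current_master is hypothetical.

```shell
#!/bin/sh
# current_master: read masterha_check_status output on stdin and print the
# master address if the manager reports that it is running.
# Assumption: the running line has the shape
#   <app> (pid:<n>) is running(<code>), master:<host>
current_master() {
    sed -n 's/.*is running([^)]*), master:\([^[:space:]]*\).*/\1/p'
}

# Example on an illustrative status line:
echo 'app1 (pid:5605) is running(0:PING_OK), master:192.168.111.128' | current_master
```

If the manager is not running, no line matches and the function prints nothing, which a caller can treat as "unknown".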
3.15 Testing
Experiment 1: test automatic failover
1. On slave1, first stop the IO thread to simulate replication lag
stop slave io_thread;
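2. Import a table on the master (make the data set fairly large; 100,000+ rows is recommended)
Meanwhile slave2 keeps replicating the data
3. Start the IO thread on slave1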
start slave io_thread;
4. Stop MySQL on the master
The experiment uses pkill mysql (do not do this in production)
5. Check the manager log; it shows that the master has changed
tail -300f /etc/mha/app1/manager.log
6. On the new master, the data that was lagging behind has also been synchronized
7. Check that the VIP has moved, i.e. whether it is now on the slave1 host
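One way to check step 7 from a script is to look for the VIP in the address list of the host that should now own it. A minimal sketch follows; the function name has_vip is hypothetical, and the sample lines imitate the one-line-per-address output of ip -o -4 addr show rather than being captured from this cluster.

```shell
#!/bin/sh
# has_vip: succeed if the address given as $1 appears in the output of
# "ip -o -4 addr show" supplied on stdin.
has_vip() {
    # -F treats the dots literally; the trailing "/" stops 192.168.111.11
    # from matching 192.168.111.111.
    grep -qF "inet $1/"
}

# Example on made-up output for a host that holds the VIP:
printf '%s\n' \
    '2: ens33    inet 192.168.111.129/24 brd 192.168.111.255 scope global ens33' \
    '2: ens33    inet 192.168.111.111/24 brd 192.168.111.255 scope global secondary ens33:0' \
    | has_vip 192.168.111.111 && echo "VIP is on this host"
```

From the manager this could be run as ssh slave1 ip -o -4 addr show | has_vip 192.168.111.111, assuming slave1 resolves and password-less SSH is in place.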
Experiment 2: test manual failover
Note: the MHA manager must not be running when a manual failover is performed; otherwise the manager will die
1. Stop the manager and MySQL on the master
/etc/mha/manager/bin/masterha_stop --conf=/etc/mha/app1.cnf
The experiment uses pkill mysql (do not do this in production)
2. Run the master_ip_online_change script on the manager