MHA(Master High Availability)目前在mysql高可用方面比較成熟。是一套優秀的做爲 mysql高可用性環境下故障切換和主從提高的高可用軟件。在MySQL故障切換過程當中,MHA 能作到在0~30秒以內自動完成數據庫的故障切換操做,而且在進行故障切換的過程當中,MHA 能在最大程度上保證數據的一致性,以達到真正意義上的高可用。html
該軟件由兩部分組成:MHA Manager(管理節點)和MHA Node(數據節點)。管理節點能夠單獨部署在一臺獨立的機器上來管理多個master-slave集羣,也能夠部署在一臺slave節點上。數據節點運行在每臺mysql服務器上。Manager會按期檢查master,若出現故障時,會自動將最新數據的slave提高爲新的master,而後將其餘的slave指向新的master。整個故障轉移程序徹底透明。node
172.28.18.71做爲MHA管理節點,負責管理Mysql主從集羣,172.28.18.69爲Mysql主庫 ,172.28.18.78爲Mysql從庫,172.28.18.71也是mysql的一個從庫,172.28.18.70爲虛擬IPmysql
參照:https://www.cnblogs.com/sky-cheng/p/10955054.htmlgit
[root@localhost /]# cd [root@localhost ~]# pwd /root
進入root目錄,執行下面命令,生成ssh keygithub
[root@localhost ~]# ssh-keygen Generating public/private rsa key pair. Enter file in which to save the key (/root/.ssh/id_rsa): Created directory '/root/.ssh'. Enter passphrase (empty for no passphrase): Enter same passphrase again: Your identification has been saved in /root/.ssh/id_rsa. Your public key has been saved in /root/.ssh/id_rsa.pub. The key fingerprint is: SHA256:wesmi4cYtji2OdL2cBpIpnEAITXtNmCg6MYvYyrKSFw root@localhost.localdomain The key's randomart image is: +---[RSA 2048]----+ |=+o. | |= o.. . | |+. o o | |o. + o | |.*.E . S | |*+= . | |oXo=... o | |@oO=...+ | |O*+.o.. | +----[SHA256]-----+
[root@localhost .ssh]# cp id_rsa.pub authorized_keys [root@localhost .ssh]# ll 總用量 12 -rw-r--r-- 1 root root 408 6月 4 14:06 authorized_keys -rw------- 1 root root 1679 6月 4 14:02 id_rsa -rw-r--r-- 1 root root 408 6月 4 14:02 id_rsa.pub
[root@localhost ~]# scp -P25601 -r /root/.ssh/ root@172.28.18.69:/root/ The authenticity of host '[172.28.18.69]:25601 ([172.28.18.69]:25601)' can't be established. ECDSA key fingerprint is SHA256:u5esiwOe7+3IGRBM9BOWYFMqe873DqimVeGBT2+nHdg. ECDSA key fingerprint is MD5:d3:05:0a:9d:92:46:7d:1c:ab:74:24:4b:cd:ae:81:b5. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[172.28.18.69]:25601' (ECDSA) to the list of known hosts. root@172.28.18.69's password: id_rsa 100% 1679 152.5KB/s 00:00 id_rsa.pub 100% 408 80.3KB/s 00:00 authorized_keys 100% 408 51.6KB/s 00:00 known_hosts 100% 182 35.6KB/s 00:00
[root@localhost ~]# scp -P25601 -r /root/.ssh/ root@172.28.18.78:/root/ The authenticity of host '[172.28.18.78]:25601 ([172.28.18.78]:25601)' can't be established. ECDSA key fingerprint is SHA256:zsVfyGQ5sza1SvWg/2wCqf4SVHMsLKXYkt4QlxE+sU4. ECDSA key fingerprint is MD5:78:13:37:ab:18:a9:b0:67:3d:0f:22:53:e6:ac:b5:62. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '[172.28.18.78]:25601' (ECDSA) to the list of known hosts. root@172.28.18.78's password: id_rsa 100% 1679 190.0KB/s 00:00 id_rsa.pub 100% 408 55.0KB/s 00:00 authorized_keys 100% 408 53.2KB/s 00:00 known_hosts 100% 364 378.8KB/s 00:00 [root@localhost ~]#
[root@localhost ~]# ssh 172.28.18.69 -p 25601 Last login: Tue Jun 4 14:13:32 2019 from 172.28.18.71 [root@server-1 ~]# exit 登出 Connection to 172.28.18.69 closed. [root@localhost ~]# ssh 172.28.18.78 -p 25601 Last login: Tue Jun 4 14:13:39 2019 from 172.28.18.71 [root@server-2 ~]# exit 登出 Connection to 172.28.18.78 closed. [root@localhost ~]#
[root@localhost ~]# mkdir /usr/local/src/mha4mysql-node [root@localhost ~]# cd /usr/local/src/mha4mysql-node
[root@localhost mha4mysql-node]# wget https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gz
--2019-06-04 14:28:19-- https://github.com/yoshinorim/mha4mysql-node/releases/download/v0.58/mha4mysql-node-0.58.tar.gz
正在解析主機 github.com (github.com)... 13.250.177.223
正在鏈接 github.com (github.com)|13.250.177.223|:443... 已鏈接。
已發出 HTTP 請求,正在等待迴應... 302 Found
位置:https://github-production-release-asset-2e65be.s3.amazonaws.com/2093258/9d78fb60-2de4-11e8-8f0c-bac507a4e54f?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20190604%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20190604T062830Z&X-Amz-Expires=300&X-Amz-Signature=ed981fb367ad8bde9852881a5336f8af1a8927afcb797a6c9eddf41f5678fd75&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3B%20filename%3Dmha4mysql-node-0.58.tar.gz&response-content-type=application%2Foctet-stream [跟隨至新的 URL]
--2019-06-04 14:28:30-- https://github-production-release-asset-2e65be.s3.amazonaws.com/2093258/9d78fb60-2de4-11e8-8f0c-bac507a4e54f?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=AKIAIWNJYAX4CSVEH53A%2F20190604%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20190604T062830Z&X-Amz-Expires=300&X-Amz-Signature=ed981fb367ad8bde9852881a5336f8af1a8927afcb797a6c9eddf41f5678fd75&X-Amz-SignedHeaders=host&actor_id=0&response-content-disposition=attachment%3B%20filename%3Dmha4mysql-node-0.58.tar.gz&response-content-type=application%2Foctet-stream
正在解析主機 github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)... 52.216.18.168
正在鏈接 github-production-release-asset-2e65be.s3.amazonaws.com (github-production-release-asset-2e65be.s3.amazonaws.com)|52.216.18.168|:443... 已鏈接。
已發出 HTTP 請求,正在等待迴應... 200 OK
長度:56220 (55K) [application/octet-stream]
正在保存至: 「mha4mysql-node-0.58.tar.gz」sql
100%[===========================================================================>] 56,220 47.8KB/s 用時 1.1s數據庫
2019-06-04 14:28:32 (47.8 KB/s) - 已保存 「mha4mysql-node-0.58.tar.gz」 [56220/56220])json
[root@localhost mha4mysql-node]# yum install perl-DBD-MySQL -y [root@localhost mha4mysql-node]# yum install perl-DBI -y [root@localhost mha4mysql-node]# yum install mysql-libs -y
[root@localhost mha4mysql-node]# tar -zxvf mha4mysql-node-0.58.tar.gz drwxr-xr-x 2 zabbix zabbix 49 3月 23 2018 t [root@localhost mha4mysql-node-0.58]# perl Makefile.PL Can't locate ExtUtils/MakeMaker.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/Install/Can.pm line 5.
報錯,不能定位ExtUtils,怎安裝ExtUtils-MakeMaker
vim
[root@localhost src]# mkdir /usr/local/src/ExtUtils-MakeMaker [root@localhost src]# cd ExtUtils-MakeMaker/ [root@localhost ExtUtils-MakeMaker]# wget http://files.directadmin.com/services/9.0/ExtUtils-MakeMaker-6.31.tar.gz
[root@localhost ExtUtils-MakeMaker]# tar -zxvf ExtUtils-MakeMaker-6.31.tar.gz [root@localhost ExtUtils-MakeMaker]# cd ExtUtils-MakeMaker-6.31 [root@localhost ExtUtils-MakeMaker-6.31]# perl Makefile.PL Checking if your kit is complete... Looks good Could not open '': 沒有那個文件或目錄 at lib/ExtUtils/MM_Unix.pm line 2697.
報錯,則安裝perl-ExtUtils-MakeMaker
bash
[root@localhost ExtUtils-MakeMaker-6.31]# yum install perl-ExtUtils-MakeMaker
再次編譯
[root@localhost ExtUtils-MakeMaker-6.31]# perl Makefile.PL Checking if your kit is complete... Looks good Writing Makefile for ExtUtils::MakeMaker [root@localhost ExtUtils-MakeMaker-6.31]#
成功,繼續make make install
[root@localhost ExtUtils-MakeMaker-6.31]# make && make install
ExtUtils-MakeMaker安裝成功後,再次編譯mha4mysql-node
[root@localhost mha4mysql-node-0.58]# cd /usr/local/src/mha4mysql-node/mha4mysql-node-0.58 [root@localhost mha4mysql-node-0.58]# perl Makefile.PL *** Module::AutoInstall version 1.06 *** Checking for Perl dependencies... Can't locate CPAN.pm in @INC (@INC contains: inc /usr/local/lib64/perl5 /usr/local/share/perl5 /usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl /usr/lib64/perl5 /usr/share/perl5 .) at inc/Module/AutoInstall.pm line 304.
提示找不到CPAN.pm,則安裝CPAN模塊
[root@localhost mha4mysql-node-0.58]# mkdir /usr/local/src/CPAN [root@localhost mha4mysql-node-0.58]# cd /usr/local/src/CPAN/ [root@localhost CPAN]# wget http://search.cpan.org/CPAN/authors/id/A/AN/ANDK/CPAN-1.9205.tar.gz [root@localhost CPAN]# tar -zxvf CPAN-1.9205.tar.gz [root@localhost CPAN]# cd CPAN-1.9205 [root@localhost CPAN-1.9205]# perl Makefile.PL Importing PAUSE public key into your GnuPG keychain... gpg: 新的配置文件‘/root/.gnupg/gpg.conf’已創建 gpg: 警告:在‘/root/.gnupg/gpg.conf’裏的選項於這次運行期間未被使用 done! (You may wish to trust it locally with 'gpg --lsign-key 450F89EC') WARNING: EXTRA_META is not a known parameter. Checking if your kit is complete... Looks good Warning: prerequisite Test::More 0 not found. 'EXTRA_META' is not a known MakeMaker parameter name. Writing Makefile for CPAN [root@localhost CPAN-1.9205]# make && make install
再次編譯mha4mysql-node
[root@server-1 CPAN-1.9205]# cd /usr/local/src/mha4mysql-node/mha4mysql-node-0.58
[root@localhost mha4mysql-node-0.58]# perl Makefile.PL *** Module::AutoInstall version 1.06 *** Checking for Perl dependencies... [Core Features] - DBI ...loaded. (1.627) - DBD::mysql ...loaded. (4.023) *** Module::AutoInstall configuration finished. Checking if your kit is complete... Looks good Writing Makefile for mha4mysql::node [root@localhost mha4mysql-node-0.58]#
成功,繼續make && make install
[root@localhost mha4mysql-node-0.58]# make && make install
另外兩臺服務器也一樣安裝mha4mysql-node
[root@localhost mha4mysql-node-0.58] yum install -y perl-Mail-Sender perl-Email-Date-Format perl-MIME-Types perl-MIME-Lite perl-Parallel-ForkManager perl-Mail-Sendmail
[root@localhost mha4mysql-node-0.58] yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-YAML-Tiny perl-PAR-Dist perl-Module-ScanDeps perl-Module-CoreList perl-Module-Build perl-CPAN perl-YAML perl-CPANPLUS perl-File-Remove perl-Module-Install
[root@localhost masterha]# cpan Module::Install
[root@localhost src]# perl -MCPAN -e "install Class::Load"
[root@localhost mha4mysql-node-0.58]# mkdir /usr/local/src/mha4mysql-manager [root@localhost mha4mysql-node-0.58]# cd /usr/local/src/mha4mysql-manager [root@localhost mha4mysql-manager]# wget https://github.com/yoshinorim/mha4mysql-manager/releases/download/v0.58/mha4mysql-manager-0.58.tar.gz [root@localhost mha4mysql-manager]# tar -zxvf mha4mysql-manager-0.58.tar.gz [root@localhost mha4mysql-manager]# cd mha4mysql-manager-0.58
[root@localhost mha4mysql-manager-0.58]# perl Makefile.PL
*** Module::AutoInstall version 1.06
*** Checking for Perl dependencies...
[Core Features]
- DBI ...loaded. (1.627)
- DBD::mysql ...loaded. (4.023)
- Time::HiRes ...loaded. (1.9725)
- Config::Tiny ...loaded. (2.14)
- Log::Dispatch ...loaded. (2.41)
- Parallel::ForkManager ...loaded. (2.02)
- MHA::NodeConst ...loaded. (0.58)
*** Module::AutoInstall configuration finished.
Writing Makefile for mha4mysql::manager
Writing MYMETA.yml and MYMETA.json
[root@localhost mha4mysql-manager-0.58]#
make && make install
[root@server-1 mha4mysql-node-0.58]# make && make install
查看結果
[root@localhost bin]# ll /usr/local/bin/masterha_* -r-xr-xr-x 1 root root 1995 6月 4 16:26 /usr/local/bin/masterha_check_repl -r-xr-xr-x 1 root root 1779 6月 4 16:26 /usr/local/bin/masterha_check_ssh -r-xr-xr-x 1 root root 1865 6月 4 16:26 /usr/local/bin/masterha_check_status -r-xr-xr-x 1 root root 3201 6月 4 16:26 /usr/local/bin/masterha_conf_host -r-xr-xr-x 1 root root 2517 6月 4 16:26 /usr/local/bin/masterha_manager -r-xr-xr-x 1 root root 2165 6月 4 16:26 /usr/local/bin/masterha_master_monitor -r-xr-xr-x 1 root root 2373 6月 4 16:26 /usr/local/bin/masterha_master_switch -r-xr-xr-x 1 root root 5172 6月 4 16:26 /usr/local/bin/masterha_secondary_check -r-xr-xr-x 1 root root 1739 6月 4 16:26 /usr/local/bin/masterha_stop
masterha_check_ssh 檢查MHA的SSH配置情況
masterha_check_repl 檢查MySQL複製情況
masterha_manger 啓動MHA
masterha_check_status 檢測當前MHA運行狀態
masterha_master_monitor 檢測master是否宕機
masterha_master_switch 控制故障轉移(自動或者手動)
masterha_conf_host 添加或刪除配置的server信息
[root@localhost ~]# mkdir /etc/masterha [root@localhost ~]# cd /etc/masterha/ [root@localhost masterha]# vim app1.cnf [server default] manager_workdir=/etc/mha manager_log=/etc/mha/manager.log
#mysql用戶和密碼 password=Zaq1xsw@ user=root
#監控主庫,發送ping包的時間間隔,默認是3秒,嘗試3次不成功,則自動進行切換操做 ping_interval=1
#複製用戶 repl_password=Zaq1xsw@ repl_user=repl #report_script=/usr/local/send_report
#經過第三方機器確認目標主庫是否存活,不是必須的,就算沒有也是能用
#secondary_check_script=masterha_secondary_check -s 172.28.18.71 -s 172.28.18.69 -s 172.28.18.78
#故障自動切換VIP調用腳本,不是必須的,就算沒有也是能用,
master_ip_failover_script=/etc/masterha/scripts/master_ip_failover
#ssh用戶 ssh_user=root ssh_port=25601
[server1]
hostname=172.28.18.71 candidate_master=1
[server2]
hostname=172.28.18.69
candidate_master=1
[server3]
hostname=172.28.18.78
candidate_master=1
root@localhost ~]
[root@localhost mysql-5.7.26]# masterha_check_ssh --conf=/etc/masterha/app1.cnf Thu Jun 6 14:11:16 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Jun 6 14:11:16 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Jun 6 14:11:16 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Jun 6 14:11:16 2019 - [info] Starting SSH connection tests.. Thu Jun 6 14:11:19 2019 - [debug] Thu Jun 6 14:11:16 2019 - [debug] Connecting via SSH from root@172.28.18.71(172.28.18.71:25601) to root@172.28.18.69(172.28.18.69:25601).. Thu Jun 6 14:11:18 2019 - [debug] ok. Thu Jun 6 14:11:18 2019 - [debug] Connecting via SSH from root@172.28.18.71(172.28.18.71:25601) to root@172.28.18.78(172.28.18.78:25601).. Thu Jun 6 14:11:18 2019 - [debug] ok. ^[[AThu Jun 6 14:11:29 2019 - [debug] Thu Jun 6 14:11:17 2019 - [debug] Connecting via SSH from root@172.28.18.78(172.28.18.78:25601) to root@172.28.18.71(172.28.18.71:25601).. Thu Jun 6 14:11:23 2019 - [debug] ok. Thu Jun 6 14:11:23 2019 - [debug] Connecting via SSH from root@172.28.18.78(172.28.18.78:25601) to root@172.28.18.69(172.28.18.69:25601).. Thu Jun 6 14:11:28 2019 - [debug] ok. Thu Jun 6 14:11:29 2019 - [debug] Thu Jun 6 14:11:17 2019 - [debug] Connecting via SSH from root@172.28.18.69(172.28.18.69:25601) to root@172.28.18.71(172.28.18.71:25601).. Thu Jun 6 14:11:22 2019 - [debug] ok. Thu Jun 6 14:11:22 2019 - [debug] Connecting via SSH from root@172.28.18.69(172.28.18.69:25601) to root@172.28.18.78(172.28.18.78:25601).. Thu Jun 6 14:11:28 2019 - [debug] ok. Thu Jun 6 14:11:29 2019 - [info] All SSH connection tests passed successfully.
測試成功
[root@localhost mysql-5.7.26]# masterha_check_repl --conf=/etc/masterha/app1.cnf Thu Jun 6 13:54:14 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Jun 6 13:54:14 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Jun 6 13:54:14 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Jun 6 13:54:14 2019 - [info] MHA::MasterMonitor version 0.58. Thu Jun 6 13:54:15 2019 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted. Compilation failed in require at (eval 55) line 3. at /usr/local/share/perl5/MHA/DBHelper.pm line 208. at /usr/local/share/perl5/MHA/Server.pm line 166. Thu Jun 6 13:54:15 2019 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln301] install_driver(mysql) failed: Attempt to reload DBD/mysql.pm aborted. Compilation failed in require at (eval 55) line 3. at /usr/local/share/perl5/MHA/DBHelper.pm line 208. at /usr/local/share/perl5/MHA/Server.pm line 166. Thu Jun 6 13:54:16 2019 - [error][/usr/local/share/perl5/MHA/ServerManager.pm, ln309] Got fatal error, stopping operations Thu Jun 6 13:54:16 2019 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln427] Error happened on checking configurations. at /usr/local/share/perl5/MHA/MasterMonitor.pm line 329. Thu Jun 6 13:54:16 2019 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln525] Error happened on monitoring servers. Thu Jun 6 13:54:16 2019 - [info] Got exit code 1 (Not master dead). MySQL Replication Health is NOT OK!
報錯,此時檢查是否安裝了perl-DBD-mysql
[root@localhost mysql-5.7.26]# yum install perl-DBD-MySQL 已加載插件:fastestmirror Loading mirror speeds from cached hostfile * base: mirrors.huaweicloud.com * epel: mirrors.aliyun.com * extras: mirrors.aliyun.com * updates: mirrors.aliyun.com 正在解決依賴關係 --> 正在檢查事務 ---> 軟件包 perl-DBD-MySQL.x86_64.0.4.023-6.el7 將被 安裝 --> 正在處理依賴關係 libmysqlclient.so.18(libmysqlclient_18)(64bit),它被軟件包 perl-DBD-MySQL-4.023-6.el7.x86_64 須要 --> 正在處理依賴關係 libmysqlclient.so.18()(64bit),它被軟件包 perl-DBD-MySQL-4.023-6.el7.x86_64 須要 --> 正在檢查事務 ---> 軟件包 mariadb-libs.x86_64.1.5.5.60-1.el7_5 將被 安裝 Removing mariadb-libs.x86_64 1:5.5.60-1.el7_5 - u due to obsoletes from installed mysql-community-libs-5.7.26-1.el7.x86_64 base/7/x86_64/filelists_db | 7.1 MB 00:00:03 --> 正在使用新的信息從新解決依賴關係 --> 正在檢查事務 ---> 軟件包 mariadb-libs.x86_64.1.5.5.60-1.el7_5 將被 安裝 --> 正在處理依賴關係 libmysqlclient.so.18(libmysqlclient_18)(64bit),它被軟件包 perl-DBD-MySQL-4.023-6.el7.x86_64 須要 --> 正在處理依賴關係 libmysqlclient.so.18(libmysqlclient_18)(64bit),它被軟件包 2:postfix-2.10.1-7.el7.x86_64 須要 --> 正在處理依賴關係 libmysqlclient.so.18()(64bit),它被軟件包 perl-DBD-MySQL-4.023-6.el7.x86_64 須要 --> 正在處理依賴關係 libmysqlclient.so.18()(64bit),它被軟件包 2:postfix-2.10.1-7.el7.x86_64 須要 --> 解決依賴關係完成 錯誤:軟件包:perl-DBD-MySQL-4.023-6.el7.x86_64 (base) 須要:libmysqlclient.so.18(libmysqlclient_18)(64bit) 錯誤:軟件包:2:postfix-2.10.1-7.el7.x86_64 (@anaconda) 須要:libmysqlclient.so.18(libmysqlclient_18)(64bit) 錯誤:軟件包:2:postfix-2.10.1-7.el7.x86_64 (@anaconda) 須要:libmysqlclient.so.18()(64bit) 錯誤:軟件包:perl-DBD-MySQL-4.023-6.el7.x86_64 (base) 須要:libmysqlclient.so.18()(64bit) 您能夠嘗試添加 --skip-broken 選項來解決該問題 ** 發現 2 個已存在的 RPM 數據庫問題, 'yum check' 輸出以下: 2:postfix-2.10.1-7.el7.x86_64 有缺乏的需求 libmysqlclient.so.18()(64bit) 2:postfix-2.10.1-7.el7.x86_64 有缺乏的需求 libmysqlclient.so.18(libmysqlclient_18)(64bit) [root@localhost mysql-5.7.26]# rpm -ivh mysql-community-libs-compat-5.7.18-1.el7.x86_64.rpm 錯誤:打開 mysql-community-libs-compat-5.7.18-1.el7.x86_64.rpm 失敗: 沒有那個文件或目錄 [root@localhost mysql-5.7.26]# ll
安裝perl-DBD-MySQL報錯沒有找到libmysqlclient.so.18()(64bit),此時須要安裝mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm包
[root@localhost mysql-5.7.26]# wget https://dev.mysql.com/get/Downloads/MySQL-5.7/mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm --2019-06-06 14:00:09-- https://dev.mysql.com/get/Downloads/MySQL-5.7/mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm 正在解析主機 dev.mysql.com (dev.mysql.com)... 137.254.60.11 正在鏈接 dev.mysql.com (dev.mysql.com)|137.254.60.11|:443... 已鏈接。 已發出 HTTP 請求,正在等待迴應... 302 Found 位置:https://cdn.mysql.com//Downloads/MySQL-5.7/mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm [跟隨至新的 URL] --2019-06-06 14:00:15-- https://cdn.mysql.com//Downloads/MySQL-5.7/mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm 正在解析主機 cdn.mysql.com (cdn.mysql.com)... 23.41.87.110 正在鏈接 cdn.mysql.com (cdn.mysql.com)|23.41.87.110|:443... 已鏈接。 已發出 HTTP 請求,正在等待迴應... 200 OK 長度:2118444 (2.0M) [application/x-redhat-package-manager] 正在保存至: 「mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm」 100%[===================================================================================>] 2,118,444 11.4KB/s 用時 2m 50s 2019-06-06 14:03:09 (12.1 KB/s) - 已保存 「mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm」 [2118444/2118444]) [root@localhost mysql-5.7.26]# rpm -ivh mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm 警告:mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm: 頭V3 DSA/SHA1 Signature, 密鑰 ID 5072e1f5: NOKEY 準備中... ################################# [100%] 正在升級/安裝... 1:mysql-community-libs-compat-5.7.2################################# [100%]
再次安裝perl-DBD-MySQL
[root@localhost mysql-5.7.26]# yum install perl-DBD-MySQL 已加載插件:fastestmirror Loading mirror speeds from cached hostfile * base: mirrors.huaweicloud.com * epel: mirrors.aliyun.com * extras: mirrors.aliyun.com * updates: mirrors.aliyun.com 正在解決依賴關係 --> 正在檢查事務 ---> 軟件包 perl-DBD-MySQL.x86_64.0.4.023-6.el7 將被 安裝 --> 解決依賴關係完成 依賴關係解決 ============================================================================================================================= Package 架構 版本 源 大小 ============================================================================================================================= 正在安裝: perl-DBD-MySQL x86_64 4.023-6.el7 base 140 k 事務概要 ============================================================================================================================= 安裝 1 軟件包 總下載量:140 k 安裝大小:323 k Is this ok [y/d/N]: y Downloading packages: perl-DBD-MySQL-4.023-6.el7.x86_64.rpm | 140 kB 00:00:00 Running transaction check Running transaction test Transaction test succeeded Running transaction 警告:RPM 數據庫已被非 yum 程序修改。 正在安裝 : perl-DBD-MySQL-4.023-6.el7.x86_64 1/1 驗證中 : perl-DBD-MySQL-4.023-6.el7.x86_64 1/1 已安裝: perl-DBD-MySQL.x86_64 0:4.023-6.el7 完畢!
再次測試repl
[root@localhost mysql-5.7.26]# masterha_check_repl --conf=/etc/masterha/app1.cnf Thu Jun 6 14:13:05 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Jun 6 14:13:05 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Jun 6 14:13:05 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Thu Jun 6 14:13:05 2019 - [info] MHA::MasterMonitor version 0.58. Thu Jun 6 14:13:06 2019 - [info] GTID failover mode = 1 Thu Jun 6 14:13:06 2019 - [info] Dead Servers: Thu Jun 6 14:13:06 2019 - [info] Alive Servers: Thu Jun 6 14:13:06 2019 - [info] 172.28.18.71(172.28.18.71:3306) Thu Jun 6 14:13:06 2019 - [info] 172.28.18.69(172.28.18.69:3306) Thu Jun 6 14:13:06 2019 - [info] 172.28.18.78(172.28.18.78:3306) Thu Jun 6 14:13:06 2019 - [info] Alive Slaves: Thu Jun 6 14:13:06 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Thu Jun 6 14:13:06 2019 - [info] GTID ON Thu Jun 6 14:13:06 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Thu Jun 6 14:13:06 2019 - [info] Primary candidate for the new Master (candidate_master is set) Thu Jun 6 14:13:06 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Thu Jun 6 14:13:06 2019 - [info] GTID ON Thu Jun 6 14:13:06 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Thu Jun 6 14:13:06 2019 - [info] Primary candidate for the new Master (candidate_master is set) Thu Jun 6 14:13:06 2019 - [info] Current Alive Master: 172.28.18.69(172.28.18.69:3306) Thu Jun 6 14:13:06 2019 - [info] Checking slave configurations.. Thu Jun 6 14:13:06 2019 - [info] read_only=1 is not set on slave 172.28.18.71(172.28.18.71:3306). Thu Jun 6 14:13:06 2019 - [info] read_only=1 is not set on slave 172.28.18.78(172.28.18.78:3306). Thu Jun 6 14:13:06 2019 - [info] Checking replication filtering settings.. Thu Jun 6 14:13:06 2019 - [info] binlog_do_db= , binlog_ignore_db= Thu Jun 6 14:13:06 2019 - [info] Replication filtering check ok. Thu Jun 6 14:13:06 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking. Thu Jun 6 14:13:06 2019 - [info] Checking SSH publickey authentication settings on the current master.. Thu Jun 6 14:13:11 2019 - [warning] HealthCheck: Got timeout on checking SSH connection to 172.28.18.69! at /usr/local/share/perl5/MHA/HealthCheck.pm line 343. Thu Jun 6 14:13:11 2019 - [info] 172.28.18.69(172.28.18.69:3306) (current master) +--172.28.18.71(172.28.18.71:3306) +--172.28.18.78(172.28.18.78:3306) Thu Jun 6 14:13:11 2019 - [info] Checking replication health on 172.28.18.71.. Thu Jun 6 14:13:11 2019 - [info] ok. Thu Jun 6 14:13:11 2019 - [info] Checking replication health on 172.28.18.78.. Thu Jun 6 14:13:11 2019 - [info] ok. Thu Jun 6 14:13:11 2019 - [warning] master_ip_failover_script is not defined. Thu Jun 6 14:13:11 2019 - [warning] shutdown_script is not defined. Thu Jun 6 14:13:11 2019 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK.
測試成功,信息顯示:
Alive Slaves: 172.28.18.78 172.28.18.71 ---- 活動從庫爲172.28.18.78,
GTID ON--GTID複製模式開啓
Replicating from 172.28.18.69 ----從172.28.18.69主庫複製
Current Alive Master: 172.28.18.69(172.28.18.69:3306) --當前活動主庫爲172.28.18.69
[root@localhost bin]# masterha_manager --conf=/etc/masterha/app1.cnf & [1] 12822 [root@localhost bin]# Thu Jun 6 10:25:43 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Thu Jun 6 10:25:43 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Thu Jun 6 10:25:43 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf.. ^C [root@localhost bin]#
查看進程
[root@localhost ~]# ps -ef|grep masterha root 12822 10037 2 10:25 pts/3 00:00:00 perl /usr/local/bin/masterha_manager --conf=/etc/masterha/app1.cnf root 12863 31965 0 10:25 pts/2 00:00:00 grep --color=auto masterha
[root@localhost bin]# masterha_check_status --conf=/etc/masterha/app1.cnf app1 (pid:12822) is running(0:PING_OK), master:172.28.18.69
[root@localhost bin]# masterha_stop --conf=/etc/masterha/app1.cnf Stopped app1 successfully. [1]+ 退出 1 masterha_manager --conf=/etc/masterha/app1.cnf
[root@localhost /]# mkdir /etc/masterha/scripts [root@localhost /]# cd /etc/masterha/scripts/[root@localhost scripts]# vim master_ip_failover
#!/usr/bin/env perl use strict; use warnings FATAL => 'all'; use Getopt::Long; my ( $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $new_master_host, $new_master_ip, $new_master_port ); my $vip = '172.28.18.70/24'; # Virtual IP my $key = "1"; my $int = "em1"; my $ssh_start_vip = "/user/sbin/ifconfig $int:$key $vip"; my $ssh_stop_vip = "/user/sbin/ifconfig $int:$key down"; $ssh_user = "root"; GetOptions( 'command=s' => \$command, 'ssh_user=s' => \$ssh_user, 'orig_master_host=s' => \$orig_master_host, 'orig_master_ip=s' => \$orig_master_ip, 'orig_master_port=i' => \$orig_master_port, 'new_master_host=s' => \$new_master_host, 'new_master_ip=s' => \$new_master_ip, 'new_master_port=i' => \$new_master_port, ); exit &main(); sub main { print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n"; if ( $command eq "stop" || $command eq "stopssh" ) { # $orig_master_host, $orig_master_ip, $orig_master_port are passed. # If you manage master ip address at global catalog database, # invalidate orig_master_ip here. my $exit_code = 1; eval { print "Disabling the VIP on old master: $orig_master_host \n"; &stop_vip(); $exit_code = 0; }; if ($@) { warn "Got Error: $@\n"; exit $exit_code; } exit $exit_code; } elsif ( $command eq "start" ) { # all arguments are passed. # If you manage master ip address at global catalog database, # activate new_master_ip here. # You can also grant write access (create user, set read_only=0, etc) here. my $exit_code = 10; eval { print "Enabling the VIP - $vip on the new master - $new_master_host \n"; &start_vip(); $exit_code = 0; }; if ($@) { warn $@; exit $exit_code; } exit $exit_code; } elsif ( $command eq "status" ) { print "Checking the Status of the script.. OK \n"; #`ssh $ssh_user\@cluster1 \" $ssh_start_vip \"`; &status(); exit 0; } else { &usage(); exit 1; } } # A simple system call that enable the VIP on the new master sub start_vip() { `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`; `ssh $ssh_user\@$new_master_host \" $arp_effect \"`; # `ssh $ssh_user\@$new_master_host \" $test \"`; } # A simple system call that disable the VIP on the old_master sub stop_vip() { `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`; } sub status() { print `ssh $ssh_user\@$orig_master_host \" ip add show $int \"`; } sub usage { print "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
[root@localhost scripts]# ./master_ip_failover --command=start --new_master_host=172.28.18.71 IN SCRIPT TEST====/usr/sbin/ifconfig em1:1 down==/usr/sbin/ifconfig em1:1 172.28.18.70/24=== Enabling the VIP - 172.28.18.70/24 on the new master - 172.28.18.71
查看VIP
[root@localhost scripts]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 84:2b:2b:5c:dc:15 brd ff:ff:ff:ff:ff:ff inet 172.28.18.71/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet 172.28.18.70/24 brd 172.28.18.255 scope global em1:1 valid_lft forever preferred_lft forever inet6 fe80::e0b8:7d61:e043:692/64 scope link noprefixroute valid_lft forever preferred_lft forever 3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:17 brd ff:ff:ff:ff:ff:ff 4: em3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:19 brd ff:ff:ff:ff:ff:ff 5: em4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:1b brd ff:ff:ff:ff:ff:ff 6: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0
設置成功,
[root@localhost scripts]# ./master_ip_failover --command=stop --orig_master_host=172.28.18.71 IN SCRIPT TEST====/usr/sbin/ifconfig em1:1 down==/usr/sbin/ifconfig em1:1 172.28.18.70/24=== Disabling the VIP on old master: 172.28.18.71
查看結果
[root@localhost scripts]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 84:2b:2b:5c:dc:15 brd ff:ff:ff:ff:ff:ff inet 172.28.18.71/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet6 fe80::e0b8:7d61:e043:692/64 scope link noprefixroute valid_lft forever preferred_lft forever 3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:17 brd ff:ff:ff:ff:ff:ff 4: em3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:19 brd ff:ff:ff:ff:ff:ff 5: em4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:1b brd ff:ff:ff:ff:ff:ff 6: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0
172.28.18.70的VIP已經被移除
[root@localhost scripts]# ./master_ip_failover --command=status --orig_master_host=172.28.18.71 IN SCRIPT TEST====/usr/sbin/ifconfig em1:1 down==/usr/sbin/ifconfig em1:1 172.28.18.70/24=== Checking the Status of the script.. OK 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 84:2b:2b:5c:dc:15 brd ff:ff:ff:ff:ff:ff inet 172.28.18.71/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet 172.28.18.70/24 brd 172.28.18.255 scope global em1:1 valid_lft forever preferred_lft forever inet6 fe80::e0b8:7d61:e043:692/64 scope link noprefixroute valid_lft forever preferred_lft forever
到此,整個環境搭建完畢,下一步測試故障轉移
[root@localhost ~]# tail -f /etc/masterha/manager.log
[root@localhost ~]# masterha_manager --conf=/etc/masterha/app1.cnf & [1] 28459 [root@localhost ~]# Mon Jun 10 16:05:12 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Jun 10 16:05:12 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Mon Jun 10 16:05:12 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Mon Jun 10 16:05:12 2019 - [info] MHA::MasterMonitor version 0.58. Mon Jun 10 16:05:14 2019 - [info] GTID failover mode = 1 Mon Jun 10 16:05:14 2019 - [info] Dead Servers: Mon Jun 10 16:05:14 2019 - [info] Alive Servers: Mon Jun 10 16:05:14 2019 - [info] 172.28.18.71(172.28.18.71:3306) Mon Jun 10 16:05:14 2019 - [info] 172.28.18.69(172.28.18.69:3306) Mon Jun 10 16:05:14 2019 - [info] 172.28.18.78(172.28.18.78:3306) Mon Jun 10 16:05:14 2019 - [info] Alive Slaves: Mon Jun 10 16:05:14 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Mon Jun 10 16:05:14 2019 - [info] GTID ON Mon Jun 10 16:05:14 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Mon Jun 10 16:05:14 2019 - [info] Primary candidate for the new Master (candidate_master is set) Mon Jun 10 16:05:14 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Mon Jun 10 16:05:14 2019 - [info] GTID ON Mon Jun 10 16:05:14 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Mon Jun 10 16:05:14 2019 - [info] Primary candidate for the new Master (candidate_master is set) Mon Jun 10 16:05:14 2019 - [info] Current Alive Master: 172.28.18.69(172.28.18.69:3306) Mon Jun 10 16:05:14 2019 - [info] Checking slave configurations.. Mon Jun 10 16:05:14 2019 - [info] read_only=1 is not set on slave 172.28.18.71(172.28.18.71:3306). Mon Jun 10 16:05:14 2019 - [info] Checking replication filtering settings.. Mon Jun 10 16:05:14 2019 - [info] binlog_do_db= , binlog_ignore_db= Mon Jun 10 16:05:14 2019 - [info] Replication filtering check ok. Mon Jun 10 16:05:14 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking. Mon Jun 10 16:05:14 2019 - [info] Checking SSH publickey authentication settings on the current master.. Mon Jun 10 16:05:14 2019 - [info] HealthCheck: SSH to 172.28.18.69 is reachable. Mon Jun 10 16:05:14 2019 - [info] 172.28.18.69(172.28.18.69:3306) (current master) +--172.28.18.71(172.28.18.71:3306) +--172.28.18.78(172.28.18.78:3306) Mon Jun 10 16:05:14 2019 - [info] Checking master_ip_failover_script status: Mon Jun 10 16:05:14 2019 - [info] /etc/masterha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.28.18.69 --orig_master_ip=172.28.18.69 --orig_master_port=3306 --orig_master_ssh_port=25601 Unknown option: orig_master_ssh_port IN SCRIPT TEST====/usr/sbin/ifconfig em1:1 down==/usr/sbin/ifconfig em1:1 172.28.18.70/24=== Checking the Status of the script.. OK 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 14:fe:b5:dc:2c:77 brd ff:ff:ff:ff:ff:ff inet 172.28.18.69/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet 172.28.18.70/32 scope global em1 valid_lft forever preferred_lft forever inet6 fe80::b3e8:e3b2:2242:a2ed/64 scope link noprefixroute valid_lft forever preferred_lft forever Mon Jun 10 16:05:14 2019 - [info] OK. Mon Jun 10 16:05:14 2019 - [warning] shutdown_script is not defined. Mon Jun 10 16:05:14 2019 - [info] Set master ping interval 1 seconds. Mon Jun 10 16:05:14 2019 - [info] Set secondary check script: masterha_secondary_check -s 172.28.18.71 -s 172.28.18.69 -s 172.28.18.78 Mon Jun 10 16:05:14 2019 - [info] Starting ping health check on 172.28.18.69(172.28.18.69:3306).. Mon Jun 10 16:05:14 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
日誌中能夠看出
172.28.18.69(172.28.18.69:3306) (current master) +--172.28.18.71(172.28.18.71:3306) +--172.28.18.78(172.28.18.78:3306)
172.28.18.69爲主庫, 172.28.18.71 ,172.28.18.78爲從庫
2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 14:fe:b5:dc:2c:77 brd ff:ff:ff:ff:ff:ff inet 172.28.18.69/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet 172.28.18.70/32 scope global em1 valid_lft forever preferred_lft forever inet6 fe80::b3e8:e3b2:2242:a2ed/64 scope link noprefixroute valid_lft forever preferred_lft forever172.28.18.69上已經設置了172.28.18.70的VIP
[root@server-1 ~]# killall mysqld [root@server-1 ~]# ps -ef|grep mysqld root 8188 29237 0 17:05 pts/0 00:00:00 grep --color=auto mysqld
Mon Jun 10 17:26:52 2019 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query) Mon Jun 10 17:26:52 2019 - [info] Executing secondary network check script: masterha_secondary_check -s 172.28.18.71 -s 172.28.18.69 -s 172.28.18.78 --user=root --master_host=172.28.18.69 --master_ip=172.28.18.69 --master_port=3306 --master_user=root --master_password=Zaq1xsw@ --ping_type=SELECT Mon Jun 10 17:26:52 2019 - [info] Executing SSH check script: exit 0 Mon Jun 10 17:26:52 2019 - [info] HealthCheck: SSH to 172.28.18.69 is reachable. Monitoring server 172.28.18.71 is reachable, Master is not reachable from 172.28.18.71. OK. Monitoring server 172.28.18.69 is reachable, Master is not reachable from 172.28.18.69. OK. Mon Jun 10 17:26:53 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.28.18.69' (111)) Mon Jun 10 17:26:53 2019 - [warning] Connection failed 2 time(s).. Monitoring server 172.28.18.78 is reachable, Master is not reachable from 172.28.18.78. OK. Mon Jun 10 17:26:53 2019 - [info] Master is not reachable from all other monitoring servers. Failover should start. Mon Jun 10 17:26:54 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.28.18.69' (111)) Mon Jun 10 17:26:54 2019 - [warning] Connection failed 3 time(s).. Mon Jun 10 17:26:55 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.28.18.69' (111)) Mon Jun 10 17:26:55 2019 - [warning] Connection failed 4 time(s).. Mon Jun 10 17:26:55 2019 - [warning] Master is not reachable from health checker! Mon Jun 10 17:26:55 2019 - [warning] Master 172.28.18.69(172.28.18.69:3306) is not reachable! Mon Jun 10 17:26:55 2019 - [warning] SSH is reachable. Mon Jun 10 17:26:55 2019 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status.. Mon Jun 10 17:26:55 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Jun 10 17:26:55 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Mon Jun 10 17:26:55 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Mon Jun 10 17:26:56 2019 - [info] GTID failover mode = 1 Mon Jun 10 17:26:56 2019 - [info] Dead Servers: Mon Jun 10 17:26:56 2019 - [info] 172.28.18.69(172.28.18.69:3306) Mon Jun 10 17:26:56 2019 - [info] Alive Servers: Mon Jun 10 17:26:56 2019 - [info] 172.28.18.71(172.28.18.71:3306) Mon Jun 10 17:26:56 2019 - [info] 172.28.18.78(172.28.18.78:3306) Mon Jun 10 17:26:56 2019 - [info] Alive Slaves: Mon Jun 10 17:26:56 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Mon Jun 10 17:26:56 2019 - [info] GTID ON Mon Jun 10 17:26:56 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Mon Jun 10 17:26:56 2019 - [info] Primary candidate for the new Master (candidate_master is set) Mon Jun 10 17:26:56 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Mon Jun 10 17:26:56 2019 - [info] GTID ON Mon Jun 10 17:26:56 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Mon Jun 10 17:26:56 2019 - [info] Primary candidate for the new Master (candidate_master is set) Mon Jun 10 17:26:56 2019 - [info] Checking slave configurations.. Mon Jun 10 17:26:56 2019 - [info] read_only=1 is not set on slave 172.28.18.71(172.28.18.71:3306). Mon Jun 10 17:26:56 2019 - [info] Checking replication filtering settings.. Mon Jun 10 17:26:56 2019 - [info] Replication filtering check ok. Mon Jun 10 17:26:56 2019 - [info] Master is down! Mon Jun 10 17:26:56 2019 - [info] Terminating monitoring script. Mon Jun 10 17:26:56 2019 - [info] Got exit code 20 (Master dead). Mon Jun 10 17:26:56 2019 - [info] MHA::MasterFailover version 0.58. Mon Jun 10 17:26:56 2019 - [info] Starting master failover. Mon Jun 10 17:26:56 2019 - [info] Mon Jun 10 17:26:56 2019 - [info] * Phase 1: Configuration Check Phase.. Mon Jun 10 17:26:56 2019 - [info] Mon Jun 10 17:26:57 2019 - [info] GTID failover mode = 1 Mon Jun 10 17:26:57 2019 - [info] Dead Servers: Mon Jun 10 17:26:57 2019 - [info] 172.28.18.69(172.28.18.69:3306) Mon Jun 10 17:26:57 2019 - [info] Checking master reachability via MySQL(double check)... Mon Jun 10 17:26:57 2019 - [info] ok. Mon Jun 10 17:26:57 2019 - [info] Alive Servers: Mon Jun 10 17:26:57 2019 - [info] 172.28.18.71(172.28.18.71:3306) Mon Jun 10 17:26:57 2019 - [info] 172.28.18.78(172.28.18.78:3306) Mon Jun 10 17:26:57 2019 - [info] Alive Slaves: Mon Jun 10 17:26:57 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Mon Jun 10 17:26:57 2019 - [info] GTID ON Mon Jun 10 17:26:57 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Mon Jun 10 17:26:57 2019 - [info] Primary candidate for the new Master (candidate_master is set) Mon Jun 10 17:26:57 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Mon Jun 10 17:26:57 2019 - [info] GTID ON Mon Jun 10 17:26:57 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Mon Jun 10 17:26:57 2019 - [info] Primary candidate for the new Master (candidate_master is set) Mon Jun 10 17:26:57 2019 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln310] Last failover was done at 2019/06/10 14:15:44. Current time is too early to do failover again. If you want to do failover, manually remove /etc/masterha//app1.failover.complete and run this script again. Mon Jun 10 17:26:57 2019 - [error][/usr/local/share/perl5/MHA/ManagerUtil.pm, ln177] Got ERROR: at /usr/local/bin/masterha_manager line 65.
報錯:Current time is too early to do failover again,manually remove /etc/masterha/app1.failover.complete and run this script again
提示刪除/etc/masterha//app1.failover.complete這個文件
[root@localhost scripts]# rm -rf /etc/masterha/app1.failover.complete [root@localhost scripts]#
再次啓動MHA,再次模擬172.28.18.69主庫mysql故障
[root@server-1 ~]# mysqladmin -uroot -p -S /home/mysql-5.7.26/run/mysql.sock shutdown Enter password: 2019-06-11T01:49:32.127585Z mysqld_safe mysqld from pid file /home/mysql-5.7.26/run/mysqld.pid ended [1]+ 完成 mysqld_safe --defaults-file=/etc/my.cnf --user=mysql
查看172.28.18.71上的manager.log日誌
Tue Jun 11 09:49:22 2019 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query) Tue Jun 11 09:49:22 2019 - [info] Executing secondary network check script: masterha_secondary_check -s 172.28.18.71 -s 172.28.18.69 -s 172.28.18.78 --user=root --master_host=172.28.18.69 --master_ip=172.28.18.69 --master_port=3306 --master_user=root --master_password=Zaq1xsw@ --ping_type=SELECT Tue Jun 11 09:49:22 2019 - [info] Executing SSH check script: exit 0 Tue Jun 11 09:49:23 2019 - [info] HealthCheck: SSH to 172.28.18.69 is reachable. Tue Jun 11 09:49:23 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.28.18.69' (111)) Tue Jun 11 09:49:23 2019 - [warning] Connection failed 2 time(s).. Monitoring server 172.28.18.71 is reachable, Master is not reachable from 172.28.18.71. OK. Monitoring server 172.28.18.69 is reachable, Master is not reachable from 172.28.18.69. OK. Tue Jun 11 09:49:24 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.28.18.69' (111)) Tue Jun 11 09:49:24 2019 - [warning] Connection failed 3 time(s).. Monitoring server 172.28.18.78 is reachable, Master is not reachable from 172.28.18.78. OK. Tue Jun 11 09:49:24 2019 - [info] Master is not reachable from all other monitoring servers. Failover should start. Tue Jun 11 09:49:25 2019 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.28.18.69' (111)) Tue Jun 11 09:49:25 2019 - [warning] Connection failed 4 time(s).. Tue Jun 11 09:49:25 2019 - [warning] Master is not reachable from health checker! Tue Jun 11 09:49:25 2019 - [warning] Master 172.28.18.69(172.28.18.69:3306) is not reachable! Tue Jun 11 09:49:25 2019 - [warning] SSH is reachable. Tue Jun 11 09:49:25 2019 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/masterha/app1.cnf again, and trying to connect to all servers to check server status.. Tue Jun 11 09:49:25 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Tue Jun 11 09:49:25 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Tue Jun 11 09:49:25 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf.. Tue Jun 11 09:49:26 2019 - [info] GTID failover mode = 1 Tue Jun 11 09:49:26 2019 - [info] Dead Servers: Tue Jun 11 09:49:26 2019 - [info] 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:26 2019 - [info] Alive Servers: Tue Jun 11 09:49:26 2019 - [info] 172.28.18.71(172.28.18.71:3306) Tue Jun 11 09:49:26 2019 - [info] 172.28.18.78(172.28.18.78:3306) Tue Jun 11 09:49:26 2019 - [info] Alive Slaves: Tue Jun 11 09:49:26 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:26 2019 - [info] GTID ON Tue Jun 11 09:49:26 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:26 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:26 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:26 2019 - [info] GTID ON Tue Jun 11 09:49:26 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:26 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:26 2019 - [info] Checking slave configurations.. Tue Jun 11 09:49:26 2019 - [info] read_only=1 is not set on slave 172.28.18.71(172.28.18.71:3306). Tue Jun 11 09:49:26 2019 - [info] Checking replication filtering settings.. Tue Jun 11 09:49:26 2019 - [info] Replication filtering check ok. Tue Jun 11 09:49:26 2019 - [info] Master is down! Tue Jun 11 09:49:26 2019 - [info] Terminating monitoring script. Tue Jun 11 09:49:26 2019 - [info] Got exit code 20 (Master dead). Tue Jun 11 09:49:26 2019 - [info] MHA::MasterFailover version 0.58. Tue Jun 11 09:49:26 2019 - [info] Starting master failover. Tue Jun 11 09:49:26 2019 - [info] Tue Jun 11 09:49:26 2019 - [info] * Phase 1: Configuration Check Phase.. Tue Jun 11 09:49:26 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] GTID failover mode = 1 Tue Jun 11 09:49:28 2019 - [info] Dead Servers: Tue Jun 11 09:49:28 2019 - [info] 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Checking master reachability via MySQL(double check)... Tue Jun 11 09:49:28 2019 - [info] ok. Tue Jun 11 09:49:28 2019 - [info] Alive Servers: Tue Jun 11 09:49:28 2019 - [info] 172.28.18.71(172.28.18.71:3306) Tue Jun 11 09:49:28 2019 - [info] 172.28.18.78(172.28.18.78:3306) Tue Jun 11 09:49:28 2019 - [info] Alive Slaves: Tue Jun 11 09:49:28 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] Starting GTID based failover. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] ** Phase 1: Configuration Check Phase completed. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] * Phase 2: Dead Master Shutdown Phase.. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] Forcing shutdown so that applications never connect to the current master.. Tue Jun 11 09:49:28 2019 - [info] Executing master IP deactivation script: Tue Jun 11 09:49:28 2019 - [info] /etc/masterha/scripts/master_ip_failover --orig_master_host=172.28.18.69 --orig_master_ip=172.28.18.69 --orig_master_port=3306 --command=stopssh --ssh_user=root --orig_master_ssh_port=25601 Unknown option: orig_master_ssh_port IN SCRIPT TEST====/usr/sbin/ifconfig em1:1 down==/usr/sbin/ifconfig em1:1 172.28.18.70/24=== Disabling the VIP on old master: 172.28.18.69 Tue Jun 11 09:49:28 2019 - [info] done. Tue Jun 11 09:49:28 2019 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master. Tue Jun 11 09:49:28 2019 - [info] * Phase 2: Dead Master Shutdown Phase completed. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] * Phase 3: Master Recovery Phase.. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] * Phase 3.1: Getting Latest Slaves Phase.. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] The latest binary log file/position on all slaves is master.000006:194 Tue Jun 11 09:49:28 2019 - [info] Latest slaves (Slaves that received relay log files to the latest): Tue Jun 11 09:49:28 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] The oldest binary log file/position on all slaves is master.000006:194 Tue Jun 11 09:49:28 2019 - [info] Oldest slaves: Tue Jun 11 09:49:28 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] * Phase 3.3: Determining New Master Phase.. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] Searching new master from slaves.. Tue Jun 11 09:49:28 2019 - [info] Candidate masters from the configuration file: Tue Jun 11 09:49:28 2019 - [info] 172.28.18.71(172.28.18.71:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 09:49:28 2019 - [info] GTID ON Tue Jun 11 09:49:28 2019 - [info] Replicating from 172.28.18.69(172.28.18.69:3306) Tue Jun 11 09:49:28 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 09:49:28 2019 - [info] Non-candidate masters: Tue Jun 11 09:49:28 2019 - [info] Searching from candidate_master slaves which have received the latest relay log events.. Tue Jun 11 09:49:28 2019 - [info] New master is 172.28.18.71(172.28.18.71:3306) Tue Jun 11 09:49:28 2019 - [info] Starting master failover.. Tue Jun 11 09:49:28 2019 - [info] From: 172.28.18.69(172.28.18.69:3306) (current master) +--172.28.18.71(172.28.18.71:3306) +--172.28.18.78(172.28.18.78:3306) To: 172.28.18.71(172.28.18.71:3306) (new master) +--172.28.18.78(172.28.18.78:3306) Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] * Phase 3.3: New Master Recovery Phase.. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] Waiting all logs to be applied.. Tue Jun 11 09:49:28 2019 - [info] done. Tue Jun 11 09:49:28 2019 - [info] Getting new master's binlog name and position.. Tue Jun 11 09:49:28 2019 - [info] slave-71.000005:194 Tue Jun 11 09:49:28 2019 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.28.18.71', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx'; Tue Jun 11 09:49:28 2019 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: slave-71.000005, 194, d24d8a53-880d-11e9-b1f3-842b2b5cdc15:1-6, ee3e292b-866b-11e9-9df8-14feb5dc2c77:1-15 Tue Jun 11 09:49:28 2019 - [info] Executing master IP activate script: Tue Jun 11 09:49:28 2019 - [info] /etc/masterha/scripts/master_ip_failover --command=start --ssh_user=root --orig_master_host=172.28.18.69 --orig_master_ip=172.28.18.69 --orig_master_port=3306 --new_master_host=172.28.18.71 --new_master_ip=172.28.18.71 --new_master_port=3306 --new_master_user='root' --orig_master_ssh_port=25601 --new_master_ssh_port=25601 --new_master_password=xxx Unknown option: new_master_user Unknown option: orig_master_ssh_port Unknown option: new_master_ssh_port Unknown option: new_master_password IN SCRIPT TEST====/usr/sbin/ifconfig em1:1 down==/usr/sbin/ifconfig em1:1 172.28.18.70/24=== Enabling the VIP - 172.28.18.70/24 on the new master - 172.28.18.71 Tue Jun 11 09:49:28 2019 - [info] OK. Tue Jun 11 09:49:28 2019 - [info] ** Finished master recovery successfully. Tue Jun 11 09:49:28 2019 - [info] * Phase 3: Master Recovery Phase completed. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] * Phase 4: Slaves Recovery Phase.. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] * Phase 4.1: Starting Slaves in parallel.. Tue Jun 11 09:49:28 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] -- Slave recovery on host 172.28.18.78(172.28.18.78:3306) started, pid: 20990. Check tmp log /etc/masterha//172.28.18.78_3306_20190611094926.log if it takes time.. Tue Jun 11 09:49:35 2019 - [info] Tue Jun 11 09:49:35 2019 - [info] Log messages from 172.28.18.78 ... Tue Jun 11 09:49:35 2019 - [info] Tue Jun 11 09:49:28 2019 - [info] Resetting slave 172.28.18.78(172.28.18.78:3306) and starting replication from the new master 172.28.18.71(172.28.18.71:3306).. Tue Jun 11 09:49:29 2019 - [info] Executed CHANGE MASTER. Tue Jun 11 09:49:35 2019 - [info] Slave started. Tue Jun 11 09:49:35 2019 - [info] gtid_wait(d24d8a53-880d-11e9-b1f3-842b2b5cdc15:1-6, ee3e292b-866b-11e9-9df8-14feb5dc2c77:1-15) completed on 172.28.18.78(172.28.18.78:3306). Executed 0 events. Tue Jun 11 09:49:35 2019 - [info] End of log messages from 172.28.18.78. Tue Jun 11 09:49:35 2019 - [info] -- Slave on host 172.28.18.78(172.28.18.78:3306) started. Tue Jun 11 09:49:35 2019 - [info] All new slave servers recovered successfully. Tue Jun 11 09:49:35 2019 - [info] Tue Jun 11 09:49:35 2019 - [info] * Phase 5: New master cleanup phase.. Tue Jun 11 09:49:35 2019 - [info] Tue Jun 11 09:49:35 2019 - [info] Resetting slave info on the new master.. Tue Jun 11 09:49:35 2019 - [info] 172.28.18.71: Resetting slave info succeeded. Tue Jun 11 09:49:35 2019 - [info] Master failover to 172.28.18.71(172.28.18.71:3306) completed successfully. Tue Jun 11 09:49:35 2019 - [info] ----- Failover Report ----- app1: MySQL Master failover 172.28.18.69(172.28.18.69:3306) to 172.28.18.71(172.28.18.71:3306) succeeded Master 172.28.18.69(172.28.18.69:3306) is down! Check MHA Manager logs at localhost.localdomain:/etc/masterha/manager.log for details. Started automated(non-interactive) failover. Invalidated master IP address on 172.28.18.69(172.28.18.69:3306) Selected 172.28.18.71(172.28.18.71:3306) as a new master. 172.28.18.71(172.28.18.71:3306): OK: Applying all logs succeeded. 172.28.18.71(172.28.18.71:3306): OK: Activated master IP address. 172.28.18.78(172.28.18.78:3306): OK: Slave started, replicating from 172.28.18.71(172.28.18.71:3306) 172.28.18.71(172.28.18.71:3306): Resetting slave info succeeded. Master failover to 172.28.18.71(172.28.18.71:3306) completed successfully.
此時日誌顯示主庫172.28.18.69故障,將172.28.18.71升級爲主庫,將VIP172.28.18.70漂移到172.28.18.71上
在172.28.18.71上查看IP
[root@localhost scripts]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 84:2b:2b:5c:dc:15 brd ff:ff:ff:ff:ff:ff inet 172.28.18.71/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet 172.28.18.70/24 brd 172.28.18.255 scope global em1:1 valid_lft forever preferred_lft forever inet6 fe80::e0b8:7d61:e043:692/64 scope link noprefixroute valid_lft forever preferred_lft forever 3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:17 brd ff:ff:ff:ff:ff:ff 4: em3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:19 brd ff:ff:ff:ff:ff:ff 5: em4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 84:2b:2b:5c:dc:1b brd ff:ff:ff:ff:ff:ff 6: tunl0@NONE: <NOARP> mtu 1480 qdisc noop state DOWN group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0
IP 已經漂移,在172.28.18.69上查看IP
[root@server-1 ~]# ip a 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000 link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 14:fe:b5:dc:2c:77 brd ff:ff:ff:ff:ff:ff inet 172.28.18.69/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet6 fe80::b3e8:e3b2:2242:a2ed/64 scope link noprefixroute valid_lft forever preferred_lft forever 3: em2: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 14:fe:b5:dc:2c:79 brd ff:ff:ff:ff:ff:ff 4: em3: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 14:fe:b5:dc:2c:7b brd ff:ff:ff:ff:ff:ff 5: em4: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc mq state DOWN group default qlen 1000 link/ether 14:fe:b5:dc:2c:7d brd ff:ff:ff:ff:ff:ff
VIP已經刪除
[root@server-1 ~]# mysqld_safe --defaults-file=/etc/my.cnf --user=mysql & [1] 32453 [root@server-1 ~]# 2019-06-11T02:09:02.264893Z mysqld_safe Logging to '/home/mysql-5.7.26/log/mysqld.log'. 2019-06-11T02:09:02.317370Z mysqld_safe Starting mysqld daemon with databases from /home/mysql-5.7.26/data
首先在172.28.18.71上的MHA日誌 manager.log裏查找CHANGE 語句
[root@localhost ~]# cat /etc/masterha/manager.log |grep CHANGE Tue Jun 11 09:49:28 2019 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.28.18.71', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx'; Tue Jun 11 09:49:29 2019 - [info] Executed CHANGE MASTER. [root@localhost ~]#
在172.28.18.69上的mysql裏執行上面的語句使172.28.18.69成爲172.28.18.71的從庫
mysql> CHANGE MASTER TO MASTER_HOST='172.28.18.71', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxxxxxxx'; ERROR 2006 (HY000): MySQL server has gone away No connection. Trying to reconnect... Connection id: 2 Current database: test Query OK, 0 rows affected, 2 warnings (0.16 sec) mysql> start slave; Query OK, 0 rows affected (0.01 sec)
查看從庫狀態
mysql> show slave status\G; *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 172.28.18.71 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: slave-71.000005 Read_Master_Log_Pos: 194 Relay_Log_File: server-1-relay-bin.000002 Relay_Log_Pos: 365 Relay_Master_Log_File: slave-71.000005 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table: Replicate_Wild_Do_Table: Replicate_Wild_Ignore_Table: Last_Errno: 0 Last_Error: Skip_Counter: 0 Exec_Master_Log_Pos: 194 Relay_Log_Space: 575 Until_Condition: None Until_Log_File: Until_Log_Pos: 0 Master_SSL_Allowed: No Master_SSL_CA_File: Master_SSL_CA_Path: Master_SSL_Cert: Master_SSL_Cipher: Master_SSL_Key: Seconds_Behind_Master: 0 Master_SSL_Verify_Server_Cert: No Last_IO_Errno: 0 Last_IO_Error: Last_SQL_Errno: 0 Last_SQL_Error: Replicate_Ignore_Server_Ids: Master_Server_Id: 71 Master_UUID: d24d8a53-880d-11e9-b1f3-842b2b5cdc15 Master_Info_File: /home/mysql-5.7.26/data/master.info SQL_Delay: 0 SQL_Remaining_Delay: NULL Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates Master_Retry_Count: 86400 Master_Bind: Last_IO_Error_Timestamp: Last_SQL_Error_Timestamp: Master_SSL_Crl: Master_SSL_Crlpath: Retrieved_Gtid_Set: Executed_Gtid_Set: d24d8a53-880d-11e9-b1f3-842b2b5cdc15:1-6, ee3e292b-866b-11e9-9df8-14feb5dc2c77:1-15 Auto_Position: 1 Replicate_Rewrite_DB: Channel_Name: Master_TLS_Version: 1 row in set (0.00 sec) ERROR: No query specified
此時,已經成爲172.28.18.71的從庫了
由於MHA在作故障切換後,會自動退出,因此每次故障恢復後,須要從新啓動MHA
[root@localhost ~]# masterha_manager --conf=/etc/masterha/app1.cnf & [2] 22275 [1] 完成 masterha_manager --conf=/etc/masterha/app1.cnf [root@localhost ~]# Tue Jun 11 10:15:25 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Tue Jun 11 10:15:25 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf.. Tue Jun 11 10:15:25 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
查看MHA狀態
[root@localhost ~]# masterha_check_status --conf=/etc/masterha/app1.cnf & [3] 22318 [root@localhost ~]# app1 (pid:22275) is running(0:PING_OK), master:172.28.18.71
運行中,再查看manager.log日誌
Tue Jun 11 10:15:25 2019 - [info] MHA::MasterMonitor version 0.58. Tue Jun 11 10:15:30 2019 - [info] GTID failover mode = 1 Tue Jun 11 10:15:30 2019 - [info] Dead Servers: Tue Jun 11 10:15:30 2019 - [info] Alive Servers: Tue Jun 11 10:15:30 2019 - [info] 172.28.18.71(172.28.18.71:3306) Tue Jun 11 10:15:30 2019 - [info] 172.28.18.69(172.28.18.69:3306) Tue Jun 11 10:15:30 2019 - [info] 172.28.18.78(172.28.18.78:3306) Tue Jun 11 10:15:30 2019 - [info] Alive Slaves: Tue Jun 11 10:15:30 2019 - [info] 172.28.18.69(172.28.18.69:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 10:15:30 2019 - [info] GTID ON Tue Jun 11 10:15:30 2019 - [info] Replicating from 172.28.18.71(172.28.18.71:3306) Tue Jun 11 10:15:30 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 10:15:30 2019 - [info] 172.28.18.78(172.28.18.78:3306) Version=5.7.26-log (oldest major version between slaves) log-bin:enabled Tue Jun 11 10:15:30 2019 - [info] GTID ON Tue Jun 11 10:15:30 2019 - [info] Replicating from 172.28.18.71(172.28.18.71:3306) Tue Jun 11 10:15:30 2019 - [info] Primary candidate for the new Master (candidate_master is set) Tue Jun 11 10:15:30 2019 - [info] Current Alive Master: 172.28.18.71(172.28.18.71:3306) Tue Jun 11 10:15:30 2019 - [info] Checking slave configurations.. Tue Jun 11 10:15:30 2019 - [info] read_only=1 is not set on slave 172.28.18.69(172.28.18.69:3306). Tue Jun 11 10:15:30 2019 - [info] Checking replication filtering settings.. Tue Jun 11 10:15:30 2019 - [info] binlog_do_db= , binlog_ignore_db= Tue Jun 11 10:15:30 2019 - [info] Replication filtering check ok. Tue Jun 11 10:15:30 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking. Tue Jun 11 10:15:30 2019 - [info] Checking SSH publickey authentication settings on the current master.. Tue Jun 11 10:15:31 2019 - [info] HealthCheck: SSH to 172.28.18.71 is reachable. Tue Jun 11 10:15:31 2019 - [info] 172.28.18.71(172.28.18.71:3306) (current master) +--172.28.18.69(172.28.18.69:3306) +--172.28.18.78(172.28.18.78:3306) Tue Jun 11 10:15:31 2019 - [info] Checking master_ip_failover_script status: Tue Jun 11 10:15:31 2019 - [info] /etc/masterha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.28.18.71 --orig_master_ip=172.28.18.71 --orig_master_port=3306 --orig_master_ssh_port=25601 Unknown option: orig_master_ssh_port IN SCRIPT TEST====/usr/sbin/ifconfig em1:1 down==/usr/sbin/ifconfig em1:1 172.28.18.70/24=== Checking the Status of the script.. OK 2: em1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000 link/ether 84:2b:2b:5c:dc:15 brd ff:ff:ff:ff:ff:ff inet 172.28.18.71/28 brd 172.28.18.79 scope global noprefixroute em1 valid_lft forever preferred_lft forever inet 172.28.18.70/24 brd 172.28.18.255 scope global em1:1 valid_lft forever preferred_lft forever inet6 fe80::e0b8:7d61:e043:692/64 scope link noprefixroute valid_lft forever preferred_lft forever Tue Jun 11 10:15:31 2019 - [info] OK. Tue Jun 11 10:15:31 2019 - [warning] shutdown_script is not defined. Tue Jun 11 10:15:31 2019 - [info] Set master ping interval 1 seconds. Tue Jun 11 10:15:31 2019 - [info] Set secondary check script: masterha_secondary_check -s 172.28.18.71 -s 172.28.18.69 -s 172.28.18.78 Tue Jun 11 10:15:31 2019 - [info] Starting ping health check on 172.28.18.71(172.28.18.71:3306).. Tue Jun 11 10:15:31 2019 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
172.28.18.71(172.28.18.71:3306) (current master)
+--172.28.18.69(172.28.18.69:3306) +--172.28.18.78(172.28.18.78:3306)此時172.28.18.71爲新的主庫,172.28.18.6九、172.28.18.78爲從庫