During a MySQL failover, MHA can complete the switchover automatically within 0–30 seconds, and while doing so it preserves data consistency to the greatest extent possible, achieving high availability in the true sense.
優勢 | 缺點 |
---|---|
由perl語言開發的開源工具 | 須要編寫腳本或利用第三方工具來實現Vip的配置 |
支持基於gtid的複製模式 | MHA啓動後只會對主數據庫進行監控 |
同一個監控節點能夠監控多個集羣 | 須要基於SSH免認證配置,存在必定的安全隱患 |
MHA在進行故障轉移時更不易產生數據丟失 | 沒有提供從服務器的讀負載均衡功能 |
Using GTID greatly simplifies replication: GTID is entirely transaction-based, so once a transaction commits on the master, it is guaranteed to be executed on the slaves.
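This walkthrough sticks to classic binlog-coordinate replication, but for reference, a GTID setup would look roughly like the following sketch (the values are illustrative and not part of this case):

```bash
# Hypothetical sketch: GTID-based replication on MySQL 5.7.
# Add to /etc/my.cnf on every node (requires a restart):
#   gtid_mode=ON
#   enforce_gtid_consistency=ON
#   log_slave_updates=ON
# Then a slave can be pointed at the master without naming a binlog file/position:
mysql -uroot -p123456 -e "CHANGE MASTER TO
  MASTER_HOST='192.168.111.3',
  MASTER_USER='myslave',
  MASTER_PASSWORD='123456',
  MASTER_AUTO_POSITION=1;
START SLAVE;"
```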
MHA consists of two parts: the MHA Manager and the MHA Node. The manager can be deployed on a dedicated machine to manage multiple clusters, or on one slave node to manage only the cluster it belongs to. MHA Node runs on every MySQL server and on the manager server. The MHA Manager probes the master's status periodically; when the master fails, it automatically promotes the slave holding the most recent data to be the new master.
When MHA performs a master switch, it tries to save the binary logs from the crashed master to minimize data loss, but some chance of loss remains: for example, if the master suffers a hardware failure or cannot be reached over SSH, MHA cannot save the binlog and fails over anyway, losing the latest data. Using the semi-synchronous replication available since MySQL 5.5 lowers this risk, and MHA can be combined with it: if even one slave has received the latest binary log events, MHA can apply them to all the other slaves, keeping every node consistent.
The lab environment here is a single cluster, so the manager node is deployed on one of the slaves and manages only this cluster.
To save resources, this case uses one master (which also serves read traffic when idle), one standby master, plus one more slave.
Since it is awkward to pin a specific version with yum, I install MySQL from the binary tarball here, use LVS to schedule the read servers, and use keepalived to make LVS highly available.
Firewalld and SELinux are turned off for the duration of the test.
Hostname | IP address | MHA role | MySQL role |
---|---|---|---|
master | 192.168.111.3 | master | master |
node1 | 192.168.111.4 | MHA manager node, slave | slave |
node2 | 192.168.111.5 | standby master | slave |
lvs1 | 192.168.111.6 | lvs, keepalived | |
lvs2 | 192.168.111.7 | lvs, keepalived | |
MHA VIP | 192.168.111.100 | | |
keepalived VIP | 192.168.111.200 | | |
```bash
[root@localhost ~]# vim /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.111.3 master
192.168.111.4 node1
192.168.111.5 node2
192.168.111.6 lvs1
192.168.111.7 lvs2
[root@localhost ~]# scp /etc/hosts root@node1:/etc/
[root@localhost ~]# scp /etc/hosts root@node2:/etc/
[root@localhost ~]# scp /etc/hosts root@lvs1:/etc/
[root@localhost ~]# scp /etc/hosts root@lvs2:/etc/
[root@localhost ~]# hostname master
[root@localhost ~]# bash
[root@master ~]# uname -n
master
[root@localhost ~]# hostname node1
[root@localhost ~]# bash
[root@node1 ~]# uname -n
node1
[root@localhost ~]# hostname node2
[root@localhost ~]# bash
[root@node2 ~]# uname -n
node2
[root@localhost ~]# hostname lvs1
[root@localhost ~]# bash
[root@lvs1 ~]# uname -n
lvs1
[root@localhost ~]# hostname lvs2
[root@localhost ~]# bash
[root@lvs2 ~]# uname -n
lvs2
```
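Optionally, a quick loop like this (my addition, not part of the original steps) confirms that every /etc/hosts entry resolves and responds:

```bash
# Sanity check: every hostname should resolve and answer one ping
for h in master node1 node2 lvs1 lvs2; do
    ping -c1 -W1 $h >/dev/null && echo "$h ok" || echo "$h unreachable"
done
```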
```bash
# Download MHA-manager and MHA-node from http://downloads.mariadb.com/MHA/
# Versions used here: mha4mysql-manager-0.56.tar.gz and mha4mysql-node-0.56.tar.gz
# Configure the EPEL repo yourself beforehand.
[root@master ~]# yum install -y perl-DBD-MySQL.x86_64 perl-DBI.x86_64 perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
# dependencies
[root@master ~]# rpm -q perl-DBD-MySQL.x86_64 perl-DBI.x86_64 perl-CPAN perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker
# verify: every one of these packages must be installed

# Install the node package on all nodes
[root@master ~]# tar xf mha4mysql-node-0.56.tar.gz
[root@master ~]# cd mha4mysql-node-0.56/
[root@master mha4mysql-node-0.56]# perl Makefile.PL
[root@master mha4mysql-node-0.56]# make && make install
[root@master ~]# ls -l /usr/local/bin/
total 40
-r-xr-xr-x 1 root root 16346 May 19 16:21 apply_diff_relay_logs
-r-xr-xr-x 1 root root  4807 May 19 16:21 filter_mysqlbinlog
-r-xr-xr-x 1 root root  7401 May 19 16:21 purge_relay_logs
-r-xr-xr-x 1 root root  7395 May 19 16:21 save_binary_logs
# the generated binaries

# Install the manager; part of the dependencies are already in place, add the missing ones
[root@node1 mha4mysql-node-0.56]# yum -y install perl perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-DBI perl-Time-HiRes perl-Config-Tiny
[root@node1 mha4mysql-node-0.56]# rpm -q perl perl-Log-Dispatch perl-Parallel-ForkManager perl-DBD-MySQL perl-DBI perl-Time-HiRes perl-Config-Tiny
perl-5.16.3-294.el7_6.x86_64
perl-Log-Dispatch-2.41-1.el7.1.noarch
perl-Parallel-ForkManager-1.18-2.el7.noarch
perl-DBD-MySQL-4.023-6.el7.x86_64
perl-DBI-1.627-4.el7.x86_64
perl-Time-HiRes-1.9725-3.el7.x86_64
perl-Config-Tiny-2.14-7.el7.noarch
[root@node1 ~]# tar xf mha4mysql-manager-0.56.tar.gz
[root@node1 ~]# cd mha4mysql-manager-0.56/
[root@node1 mha4mysql-manager-0.56]# perl Makefile.PL && make && make install
[root@node1 mha4mysql-manager-0.56]# ls /usr/local/bin/
apply_diff_relay_logs  masterha_check_ssh     masterha_manager         masterha_secondary_check  save_binary_logs
filter_mysqlbinlog     masterha_check_status  masterha_master_monitor  masterha_stop
masterha_check_repl    masterha_conf_host     masterha_master_switch   purge_relay_logs
```
```bash
# On the manager
[root@node1 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:qvOacPMWP+iO4BcxPtHJkVDdJXE4HNklCxlPkWShdMM root@node1
The key's randomart image is:
+---[RSA 2048]----+
|    .o.o oB%%=.  |
|     o ..BXE+    |
|      o o ..o    |
|       + +       |
|      . + S      |
|       + ..      |
|    o oo.+       |
|   . +o*o o      |
|    ..=B= .      |
+----[SHA256]-----+
# I accepted every default by pressing Enter; you can set your own options instead
[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@master
[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2
[root@node1 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1
# node1 is this machine itself, so logically this changes nothing, but to keep
# the mesh complete node1 gets the key too; the SSH internals are worth a look
# if you are curious why
[root@node1 ~]# ssh node1
Last login: Sat Apr 27 13:33:49 2019 from 192.168.111.1
[root@node1 ~]# exit
logout
Connection to node1 closed.
[root@node1 ~]# ssh node2
Last login: Thu Apr 18 22:55:10 2019 from 192.168.111.1
[root@node2 ~]# exit
logout
Connection to node2 closed.
[root@node1 ~]# ssh master
Last login: Sun May 19 16:00:20 2019 from 192.168.111.1
# To be safe, I tried each one in turn

# Every node also needs to distribute its public key. Normally only the node
# servers would need this, but in this case the manager sits on a node server
# rather than on a dedicated machine, so it gets a copy as well.
[root@master ~]# ssh-keygen -t rsa
[root@master ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1
[root@master ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node2
[root@node2 ~]# ssh-keygen -t rsa
[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@master
[root@node2 ~]# ssh-copy-id -i /root/.ssh/id_rsa.pub root@node1
```
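Before moving on, the passwordless mesh can be verified non-interactively with a small loop (my addition; masterha_check_ssh will verify it properly later). BatchMode makes SSH fail instead of prompting when a key is missing:

```bash
# Run on each node: every hop should print the remote hostname without a password prompt
for h in master node1 node2; do
    ssh -o BatchMode=yes root@$h 'echo "$(hostname): ok"'
done
```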
```bash
# Binary MySQL installation on all three nodes
yum -y install libaio
wget http://mirrors.sohu.com/mysql/MySQL-5.7/mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz
useradd -M -s /sbin/nologin mysql
tar zxf mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz
mv mysql-5.7.24-linux-glibc2.12-x86_64 /usr/local/mysql
chown -R mysql:mysql /usr/local/mysql
ln -s /usr/local/mysql/bin/* /usr/local/bin/
/usr/local/mysql/bin/mysqld --user=mysql --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --initialize
cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysqld
# Record the randomly generated login password:
2019-05-18T08:43:11.094845Z 1 [Note] A temporary password is generated for root@localhost: 2Gk75Zvp&!-y
# After starting the service, change it with:
#   mysqladmin -u root -p'old password' password 'new password'

# master:
vim /etc/my.cnf
[mysqld]
datadir=/usr/local/mysql/data
socket=/usr/local/mysql/data/mysql.sock
server-id=1
log-bin=mysql-binlog
log-slave-updates=true
symbolic-links=0
[mysqld_safe]
log-error=/usr/local/mysql/data/mysql.log
pid-file=/usr/local/mysql/data/mysql.pid

# node1:
[root@node1 ~]# vim /etc/my.cnf
[mysqld]
datadir=/usr/local/mysql/data
socket=/tmp/mysql.sock
server-id=2
relay-log=relay-log-bin
relay-log-index=slave-relay-bin.index
symbolic-links=0
[mysqld_safe]
log-error=/usr/local/mysql/data/mysql.log
pid-file=/usr/local/mysql/data/mysql.pid

# node2:
[root@node2 mysql]# vim /etc/my.cnf
[mysqld]
datadir=/usr/local/mysql/data
socket=/tmp/mysql.sock
server-id=3
relay-log=relay-log-bin
relay-log-index=slave-relay-bin.index
symbolic-links=0
[mysqld_safe]
log-error=/usr/local/mysql/data/mysql.log
pid-file=/usr/local/mysql/data/mysql.pid

# On the master:
mysql> grant replication slave on *.* to 'myslave'@'192.168.111.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.10 sec)
mysql> show master status;
+---------------------+----------+--------------+------------------+-------------------+
| File                | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+---------------------+----------+--------------+------------------+-------------------+
| mysql-binlog.000002 |      864 |              |                  |                   |
+---------------------+----------+--------------+------------------+-------------------+
# This is a test environment with no pre-existing data, so I skip backing anything up.

# node1, node2:
mysql> change master to
    -> master_host='192.168.111.3',
    -> master_user='myslave',
    -> master_password='123456',
    -> master_log_file='mysql-binlog.000002',
    -> master_log_pos=864;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
# Then create a database and table on the master and verify they appear on the
# slaves; not shown here.
```
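A one-liner (my addition) to confirm both replication threads on each slave without scrolling through the full status output:

```bash
# Both threads must report Yes on node1 and node2
mysql -uroot -p123456 -e "show slave status\G" | egrep 'Slave_(IO|SQL)_Running:'
```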
```bash
[root@master plugin]# ll -lh semisync_*
-rwxr-xr-x 1 mysql mysql 692K Oct  4 2018 semisync_master.so
-rwxr-xr-x 1 mysql mysql 149K Oct  4 2018 semisync_slave.so
# these are the semi-synchronous replication plugins
[root@master plugin]# mysql -u root -p123456
# install on all nodes as follows
mysql> install plugin rpl_semi_sync_master soname 'semisync_master.so';
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
# install the plugins
mysql> SELECT PLUGIN_NAME, PLUGIN_STATUS FROM INFORMATION_SCHEMA.PLUGINS WHERE PLUGIN_NAME LIKE '%semi%';
+----------------------+---------------+
| PLUGIN_NAME          | PLUGIN_STATUS |
+----------------------+---------------+
| rpl_semi_sync_slave  | ACTIVE        |
| rpl_semi_sync_master | ACTIVE        |
+----------------------+---------------+
set global rpl_semi_sync_master_enabled=on;   # enable on the master
set global rpl_semi_sync_slave_enabled=on;    # enable on the node1 slave
mysql> set global rpl_semi_sync_slave_enabled=on;
mysql> set global rpl_semi_sync_master_enabled=on;
# node2 enables both, since it is also the standby master
```
The settings above are lost when MySQL restarts; adding them to the configuration file makes them persistent.
```bash
# master:
vim /etc/my.cnf
# note: repeating plugin-load overrides the previous one, so load both plugins
# in a single semicolon-separated plugin-load option
plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_master_enabled=on

# node1:
vim /etc/my.cnf
plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_slave_enabled=on

# node2:
vim /etc/my.cnf
plugin-load="rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
rpl_semi_sync_slave_enabled=on
rpl_semi_sync_master_enabled=on
```
```bash
mysql> create database qiao;
Query OK, 1 row affected (0.50 sec)
mysql> SHOW GLOBAL STATUS LIKE '%semi%';
+--------------------------------------------+--------+
| Variable_name                              | Value  |
+--------------------------------------------+--------+
| Rpl_semi_sync_master_yes_tx                | 1      |
# Run a few test writes: this counter grows; then check on the slaves that the data replicated
```
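Two more standard semi-sync status variables (my addition) make the check less ambiguous: the clients counter shows how many semi-sync slaves are connected, and the status variable shows whether the master is currently replicating semi-synchronously:

```bash
mysql -uroot -p123456 -e "SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_clients';
SHOW GLOBAL STATUS LIKE 'Rpl_semi_sync_master_status';"
```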
node2 is the same as node1:

```bash
mysql> set global read_only=1;
# This only restricts ordinary users, not accounts with the SUPER privilege such
# as root. To refuse writes from every user, use "flush tables with read lock",
# after which nobody can write at all.
```
```bash
# master:
mysql> grant all privileges on *.* to 'root'@'192.168.111.%' identified by '123456';
mysql> flush privileges;
# Check on the remaining nodes:
mysql> show grants for root@'192.168.111.%';
+-------------------------------------------------------+
| Grants for root@192.168.111.%                         |
+-------------------------------------------------------+
| GRANT ALL PRIVILEGES ON *.* TO 'root'@'192.168.111.%' |
+-------------------------------------------------------+
# The grant itself modified data in the mysql system database, so seeing it here
# also confirms replication is working
```
```bash
[root@node1 ~]# vim /etc/masterha/app1.cnf
[server default]
manager_workdir=/var/log/masterha/app1
# absolute path for the status files the MHA manager generates; defaults to /var/tmp if unset
manager_log=/var/log/masterha/app1/manager.log
# absolute path of the MHA manager log; if unset, the manager prints to stdout/stderr,
# e.g. the messages emitted while it performs a failover
master_binlog_dir=/usr/local/mysql/data
# absolute path where the master writes its binlogs. Used when the master is dead but still
# reachable over SSH, to read and copy the binlog events needed for recovery. Required,
# because once mysqld on the master is down there is no way to detect the binlog directory
# automatically. The default is /var/lib/mysql,/var/log/mysql: /var/lib/mysql is the default
# binlog directory of most MySQL distributions, /var/log/mysql is the Ubuntu package default.
# Multiple values may be given, separated by commas.
master_ip_failover_script=/usr/local/bin/master_ip_failover
# a script you write yourself to transparently point applications at the new master;
# used here to move the VIP
password=123456
user=root
# the management account and password on the target MySQL instances; preferably root, since
# all the management commands (stop slave, change master, reset slave, ...) need it.
# Default: root
ping_interval=1
# how often (seconds) the manager pings the master (a "select ping" SQL statement). After
# three consecutive lost pings the manager declares the master dead, so a failure is detected
# within at most four ping intervals. Default: 3 seconds
remote_workdir=/usr/local/mysql/data
# absolute working path on every MHA node (the MySQL servers) where log files are generated;
# created automatically if missing. If the path is not accessible, MHA aborts. Make sure the
# filesystem there has enough free space. Default: /var/tmp
repl_password=123456
repl_user=myslave
# the replication user and password used when running change master on all slaves; this user
# should have the REPLICATION SLAVE privilege on the master

[server1]
hostname=master    # hostname or IP address
port=3306          # MySQL port

[server2]
hostname=node2
candidate_master=1
# promotes a more reliable machine among the slaves to new master (say, a RAID 10 slave over
# a RAID 0 one): adding candidate_master=1 under a slave's section raises its priority. The
# slave must have binlog enabled and no significant replication lag; otherwise it will not
# become the new master even with this flag. The parameter only raises priority, it is not a
# guarantee.
port=3306
check_repl_delay=0
# by default MHA will not pick a slave as the new master if it is more than 100 MB of relay
# logs behind, since that lengthens recovery. Setting this to 0 makes MHA ignore replication
# delay when choosing the new master; use it together with candidate_master=1 when you want
# a specific slave to become the new master.

[server3]
hostname=node1
port=3306
```

On node2:

```bash
[root@node2 ~]# vim /etc/my.cnf
log-bin=mysql-binlog    # enable binary logging, then restart the service
```
```perl
[root@node1 ~]# vim /usr/local/bin/masterha_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port,
);

my $vip = '192.168.111.100';
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig ens32:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig ens32:$key down";
$ssh_user = "root";

GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ( $command eq "stop" || $command eq "stopssh" ) {
        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            # updating global catalog, etc
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        `ssh $ssh_user\@$orig_master_ip \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enable the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
```

In the script above, only the VIP and the NIC name need changing for your environment.

```bash
[root@node1 ~]# chmod +x /usr/local/bin/masterha_ip_failover
```
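As a sanity check, the script can be dry-run by hand. Per its own `status` branch above, this re-attaches the VIP on the current master over SSH (the arguments shown are this case's values):

```bash
# Manual dry run of the VIP script's status branch
/usr/local/bin/master_ip_failover --command=status \
    --orig_master_host=master --orig_master_ip=192.168.111.3 --orig_master_port=3306
```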
In a standard MySQL replication setup, a slave's relay logs are deleted automatically once the SQL thread has finished executing them. In this MHA scenario, however, recovering a lagging slave may depend on another slave's relay logs, so automatic purging is disabled and a scheduled job cleans the logs up periodically instead.
The scheduled job below deletes relay logs by way of hard links, because of how the filesystem works: a file with no extra hard links corresponds to a single inode, and deleting it frees its data blocks, whereas a hard-linked file has several names pointing at the same inode, so a delete removes only one pointer to that inode. The same trick is often used when dropping large files or big database tables; a short demo follows.
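Here is a minimal illustration (my own example, not from the original setup) of the shared-inode behavior:

```bash
# Both names share one inode, so removing one name does not free the data blocks
touch /tmp/relaylog.demo
ln /tmp/relaylog.demo /tmp/relaylog.link      # hard link, same inode
ls -i /tmp/relaylog.demo /tmp/relaylog.link   # identical inode numbers
rm /tmp/relaylog.demo                         # the data survives via the link
ls -i /tmp/relaylog.link
```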
==Note, however, that preventing relay log purging has its own drawbacks==; see this article.
For details on the purge_relay_logs tool used in the script below, see this document.
```bash
mysql> set global relay_log_purge=0;   # run the same on both slave nodes
# Create the script and add a cron job (same on both slaves)
[root@node1 ~]# vim /opt/purge_relay_log.sh
#!/bin/bash
user=root
passwd=123456          # monitoring account and password
port=3306              # port
log_dir='/usr/local/mysql/data'    # relay log directory
work_dir='/tmp'        # where purge_relay_logs creates the relay log hard links; the default
                       # is /var/tmp, but hard links cannot span partitions, so it is best to
                       # set the location explicitly. Once the script succeeds, the
                       # hard-linked relay logs are removed.
purge='/usr/local/bin/purge_relay_logs'

if [ ! -d $log_dir ]
then
    mkdir $log_dir -p
fi

$purge --user=$user --password=$passwd --disable_relay_log_purge --port=$port --workdir=$work_dir >> $log_dir/purge_relay_logs.log 2>&1

[root@node1 ~]# chmod +x /opt/purge_relay_log.sh
[root@node1 ~]# crontab -e
0 4 * * * /bin/bash /opt/purge_relay_log.sh
# Run it once by hand
[root@node1 data]# purge_relay_logs --user=root --password=123456 --disable_relay_log_purge --port=3306 --workdir=/tmp
2019-04-27 23:50:55: purge_relay_logs script started.
 Found relay_log.info: /usr/local/mysql/data/relay-log.info
 Removing hard linked relay log files relay-log-bin* under /tmp.. done.
 Current relay log file: /usr/local/mysql/data/relay-log-bin.000007
 Archiving unused relay log files (up to /usr/local/mysql/data/relay-log-bin.000006) ...
 Creating hard link for /usr/local/mysql/data/relay-log-bin.000006 under /tmp/relay-log-bin.000006 .. ok.
 Creating hard links for unused relay log files completed.
 Executing SET GLOBAL relay_log_purge=1; FLUSH LOGS; sleeping a few seconds so that SQL thread can delete older relay log files (if it keeps up); SET GLOBAL relay_log_purge=0; .. ok.
 Removing hard linked relay log files relay-log-bin* under /tmp.. done.
2019-04-27 23:50:58: All relay log purging operations succeeded.
```
```bash
[root@node1 data]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Sat Apr 27 23:55:13 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Apr 27 23:55:13 2019 - [info] Reading application default configurations from /etc/masterha/app1.cnf..
Sat Apr 27 23:55:13 2019 - [info] Reading server configurations from /etc/masterha/app1.cnf..
Sat Apr 27 23:55:13 2019 - [info] Starting SSH connection tests..
Sat Apr 27 23:55:15 2019 - [debug]
Sat Apr 27 23:55:14 2019 - [debug]  Connecting via SSH from root@node1(192.168.111.4:22) to root@master(192.168.111.3:22)..
Sat Apr 27 23:55:15 2019 - [debug]   ok.
Sat Apr 27 23:55:15 2019 - [debug]  Connecting via SSH from root@node1(192.168.111.4:22) to root@node2(192.168.111.5:22)..
Sat Apr 27 23:55:15 2019 - [debug]   ok.
Sat Apr 27 23:55:15 2019 - [debug]
Sat Apr 27 23:55:13 2019 - [debug]  Connecting via SSH from root@node2(192.168.111.5:22) to root@master(192.168.111.3:22)..
Sat Apr 27 23:55:14 2019 - [debug]   ok.
Sat Apr 27 23:55:14 2019 - [debug]  Connecting via SSH from root@node2(192.168.111.5:22) to root@node1(192.168.111.4:22)..
Sat Apr 27 23:55:14 2019 - [debug]   ok.
Sat Apr 27 23:55:15 2019 - [debug]
Sat Apr 27 23:55:13 2019 - [debug]  Connecting via SSH from root@master(192.168.111.3:22) to root@node2(192.168.111.5:22)..
Sat Apr 27 23:55:13 2019 - [debug]   ok.
Sat Apr 27 23:55:13 2019 - [debug]  Connecting via SSH from root@master(192.168.111.3:22) to root@node1(192.168.111.4:22)..
Sat Apr 27 23:55:14 2019 - [debug]   ok.
Sat Apr 27 23:55:15 2019 - [info] All SSH connection tests passed successfully.
```
```bash
[root@node1 data]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Sat Apr 27 23:57:21 2019 - [error][/usr/local/share/perl5/MHA/Server.pm, ln383] node2(192.168.111.5:3306): User myslave does not exist or does not have REPLICATION SLAVE privilege! Other slaves can not start replication from this host.
Sat Apr 27 23:57:21 2019 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln401] Error happend on checking configurations. at /usr/local/share/perl5/MHA/ServerManager.pm line 1354.
Sat Apr 27 23:57:21 2019 - [error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln500] Error happened on monitoring servers.
# The normal output is omitted; look at the errors.
# The first log line shows a replication-user privilege problem; check on 192.168.111.5:
mysql> show grants for myslave@'192.168.111.%';
+-------------------------------------------------------------+
| Grants for myslave@192.168.111.%                            |
+-------------------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'myslave'@'192.168.111.%' |
+-------------------------------------------------------------+
# The master has this grant but the slaves do not, because I created the grant on the master
# before enabling the binlog; replication only carried changes made after the
# "show master status" position was taken.
mysql> grant replication slave on *.* to myslave@'192.168.111.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
# Grant again on the master; replication is running now, so both slaves see it as well:
mysql> show grants for myslave@'192.168.111.%';
+-------------------------------------------------------------+
| Grants for myslave@192.168.111.%                            |
+-------------------------------------------------------------+
| GRANT REPLICATION SLAVE ON *.* TO 'myslave'@'192.168.111.%' |
+-------------------------------------------------------------+
# Run the check again
[root@node1 data]# masterha_check_repl --conf=/etc/masterha/app1.cnf
# Still an error; keep fixing:
[error][/usr/local/share/perl5/MHA/MasterMonitor.pm, ln401] Error happend on checking configurations. Can't exec "/usr/local/bin/master_ip_failover": No such file or directory at /usr/local/share/perl5/MHA/ManagerUtil.pm line 68.
# The failover script's filename does not match what the config expects; rename it:
[root@node1 data]# ll /usr/local/bin/masterha_ip_failover
[root@node1 data]# mv /usr/local/bin/masterha_ip_failover /usr/local/bin/master_ip_failover
# Retry
[root@node1 data]# masterha_check_repl --conf=/etc/masterha/app1.cnf
MySQL Replication Health is OK
# The final line says replication is healthy, at least for now; carry on.
```
There are two ways to handle the VIP: let keepalived or heartbeat manage its failover, or handle it with a script (the command approach).
This case takes the command approach.
```bash
[root@node1 data]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 is stopped(2:NOT_RUNNING).
# A running manager reports "PING_OK"; "NOT_RUNNING" means MHA monitoring is not started.
# Everything so far was manual pre-flight testing; monitoring was never switched on.
[root@node1 data]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover > /var/log/masterha/app1/manager.log 2>&1 &
[1] 67827
# --remove_dead_master_conf: after a successful failover, the manager automatically removes
#   the dead master's section from the configuration file.
# --ignore_last_failover: by default MHA refuses to fail over again if the previous failover
#   happened less than 8 hours earlier, to avoid ping-pong flapping. After a failover the
#   manager writes an app1.failover.complete file into its work directory (set above), and a
#   later failover is refused while that file exists unless it is deleted first; this flag
#   tells MHA to ignore the file, which is more convenient here.
[root@node1 data]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:67827) is running(0:PING_OK), master:master
# check the status again
```
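To stop monitoring cleanly later (for maintenance, say), the bundled masterha_stop tool, installed alongside the other manager binaries above, is preferable to killing the nohup job:

```bash
masterha_stop --conf=/etc/masterha/app1.cnf
```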
The VIP is now visible on master. To test the switchover, stop MySQL on master with `/etc/init.d/mysqld stop`.
node2 is the standby master; check whether the VIP has moved there.
```bash
[root@node2 ~]# ip a | grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.111.5/24 brd 192.168.111.255 scope global noprefixroute ens32
    inet 192.168.111.100/24 brd 192.168.111.255 scope global secondary ens32:1
# the VIP moved successfully
```
node1 is the slave; check the replication status there.
```bash
[root@node1 ~]# mysql -uroot -p123456
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.111.5
                  Master_User: myslave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-binlog.000004
          Read_Master_Log_Pos: 154
               Relay_Log_File: relay-log-bin.000002
                Relay_Log_Pos: 323
        Relay_Master_Log_File: mysql-binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
# the replication master has been switched to 192.168.111.5
[root@node1 ~]# jobs -l
[2]+ 73464 Stopped    vim /usr/local/bin/master_ip_failover
# the manager completed the failover and exited (only my suspended vim remains in the
# job list); it can be started again
[root@node1 ~]# vim /etc/masterha/app1.cnf
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
password=123456
ping_interval=1
remote_workdir=/usr/local/mysql/data
repl_password=123456
repl_user=myslave
user=root

[server2]
candidate_master=1
check_repl_delay=0
hostname=node2
port=3306

[server3]
hostname=node1
port=3306
# the failed server1 has been removed from the configuration file automatically
```
```bash
[root@master ~]# /etc/init.d/mysqld start    # after repairing master, start MySQL
[root@master ~]# mysql -uroot -p123456
mysql> stop slave;
mysql> change master to
    -> master_host='192.168.111.5',
    -> master_user='myslave',
    -> master_password='123456';
# point at the new master's IP; the binlog file and position parameters are omitted here
mysql> start slave;
mysql> show slave status\G
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

# On the manager:
[root@node1 ~]# vim /etc/masterha/app1.cnf
# add the following back:
[server1]
hostname=master
port=3306
[root@node1 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf    # re-check the cluster
MySQL Replication Health is OK.
[root@node1 ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover > /var/log/masterha/app1/manager.log 2>&1 &
[3] 75013
[root@node1 ~]# jobs -l
[3]- 75013 Running    nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover > /var/log/masterha/app1/manager.log 2>&1 &

# node2: stop the service and test that the VIP switches back automatically
[root@node2 ~]# /etc/init.d/mysqld stop
Shutting down MySQL............ SUCCESS!
[root@node2 ~]# ip a | grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.111.5/24 brd 192.168.111.255 scope global noprefixroute ens32

# master: check
[root@master ~]# ip a | grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.111.3/24 brd 192.168.111.255 scope global noprefixroute ens32
    inet 192.168.111.100/24 brd 192.168.111.255 scope global secondary ens32:1

# node1: the slave's replication source has switched back automatically
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.111.3
                  Master_User: myslave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-binlog.000004
          Read_Master_Log_Pos: 398
               Relay_Log_File: relay-log-bin.000002
                Relay_Log_Pos: 323
        Relay_Master_Log_File: mysql-binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
```
Then bring node2 back into MHA following the same steps and configuration; a condensed sketch follows.
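Condensed, the re-add procedure for node2 looks like this sketch (same values and tools as used above):

```bash
# On node2, once repaired: rejoin as a slave of the current master (192.168.111.3)
mysql -uroot -p123456 -e "STOP SLAVE;
CHANGE MASTER TO MASTER_HOST='192.168.111.3',
  MASTER_USER='myslave', MASTER_PASSWORD='123456';
START SLAVE;"
# On the manager (node1): restore the [server2] section in /etc/masterha/app1.cnf,
# then re-check replication and restart monitoring
masterha_check_repl --conf=/etc/masterha/app1.cnf
nohup masterha_manager --conf=/etc/masterha/app1.cnf \
    --remove_dead_master_conf --ignore_last_failover \
    > /var/log/masterha/app1/manager.log 2>&1 &
```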
If an error like the following appears (here the SQL thread stops with error 1007 because the qiao database replayed from the relay log already exists on the slave):

```bash
mysql> stop slave;
Query OK, 0 rows affected (0.01 sec)
mysql> change master to master_host='192.168.111.3', master_user='myslave', master_password='123456';
Query OK, 0 rows affected, 2 warnings (0.11 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.111.3
                  Master_User: myslave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-binlog.000004
          Read_Master_Log_Pos: 842
               Relay_Log_File: relay-log-bin.000003
                Relay_Log_Pos: 1392
        Relay_Master_Log_File: mysql-binlog.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1007
                   Last_Error: Error 'Can't create database 'qiao'; database exists' on query. Default database: 'qiao'. Query: 'create database qiao'
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 1173
              Relay_Log_Space: 4470
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1007
               Last_SQL_Error: Error 'Can't create database 'qiao'; database exists' on query. Default database: 'qiao'. Query: 'create database qiao'
```

Fix: skip the offending event and restart the SQL thread:

```bash
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
mysql> set global sql_slave_skip_counter=1;
Query OK, 0 rows affected (0.00 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.111.3
                  Master_User: myslave
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-binlog.000004
          Read_Master_Log_Pos: 842
               Relay_Log_File: relay-log-bin.000006
                Relay_Log_Pos: 323
        Relay_Master_Log_File: mysql-binlog.000004
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 842
              Relay_Log_Space: 1191
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
                  Master_UUID: 9da60612-7a17-11e9-b288-000c2935c4a6
             Master_Info_File: /usr/local/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
1 row in set (0.00 sec)
```
```bash
# Install the same packages on both machines first
[root@lvs2 ~]# yum -y install ipvsadm kernel-devel openssl-devel keepalived
[root@lvs2 ~]# vim /etc/keepalived/keepalived.conf
# The keepalived/LVS directives are not explained here; plenty of other documentation covers them
! Configuration File for keepalived
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 192.168.200.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
   vrrp_skip_check_adv_addr
   vrrp_garp_interval 0
   vrrp_gna_interval 0
}

vrrp_instance VI_1 {
    state MASTER
    interface ens32
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.111.200
    }
}

virtual_server 192.168.111.200 3306 {
    delay_loop 6
    lb_algo rr
    lb_kind DR
    protocol TCP
    real_server 192.168.111.4 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
    }
    real_server 192.168.111.5 3306 {
        weight 1
        TCP_CHECK {
            connect_timeout 10
            nb_get_retry 3
            delay_before_retry 3
            connect_port 3306
        }
    }
}

[root@lvs2 ~]# scp /etc/keepalived/keepalived.conf root@lvs1:/etc/keepalived/
# copy to the other machine, then change the following lines on lvs1:
[root@lvs1 ~]# vim /etc/keepalived/keepalived.conf
router_id LVS_DEVEL1    # line 12
state BACKUP            # line 20
priority 90
[root@lvs2 ~]# systemctl start keepalived.service    # start the service on both machines
[root@lvs2 ~]# ip a | grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.111.7/24 brd 192.168.111.255 scope global noprefixroute ens32
    inet 192.168.111.200/32 scope global ens32
[root@lvs2 ~]# ipvsadm -ln    # check the status
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  192.168.111.200:3306 rr
  -> 192.168.111.4:3306           Route   1      0          0
  -> 192.168.111.5:3306           Route   1      0          0
```
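A quick keepalived failover check (my addition) before wiring up the real servers:

```bash
# On lvs2 (current MASTER): stop keepalived and watch the VIP move
systemctl stop keepalived
# On lvs1 (BACKUP): the VIP 192.168.111.200 should now appear
ip a | grep 192.168.111.200
# Restore: start keepalived again on lvs2; the higher priority takes the VIP back
systemctl start keepalived
```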
```bash
# node1 and node2 (the script is identical on both read nodes):
[root@node1 ~]# vim /opt/realserver.sh
#!/bin/bash
# Bind the keepalived VIP on lo and suppress ARP, so the LVS-DR real servers
# accept traffic addressed to the VIP without answering ARP requests for it
SNS_VIP=192.168.111.200
ifconfig lo:0 $SNS_VIP netmask 255.255.255.255 broadcast $SNS_VIP
/sbin/route add -host $SNS_VIP dev lo:0
echo "1" >/proc/sys/net/ipv4/conf/lo/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/lo/arp_announce
echo "1" >/proc/sys/net/ipv4/conf/all/arp_ignore
echo "2" >/proc/sys/net/ipv4/conf/all/arp_announce
sysctl -p >/dev/null 2>&1

[root@node1 ~]# sh /opt/realserver.sh    # run on both read nodes
# Test by connecting to the VIP from the manager/node1 machine:
[root@node1 ~]# mysql -uroot -p123456 -h192.168.111.200
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 749
Server version: 5.7.24 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its affiliates.
Other names may be trademarks of their respective owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql>
```
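To watch the rr scheduler at work, a short loop like this (my addition) helps; run it from a client that is not itself a real server, such as the master host, since a real server holding the VIP on lo would answer its own connection locally:

```bash
# Each new connection through the LVS VIP is dispatched round-robin;
# @@hostname reveals which backend served the query
for i in 1 2 3 4; do
    mysql -uroot -p123456 -h192.168.111.200 -N -e "select @@hostname;"
done
```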