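MHA (Master High Availability) keeps MySQL highly available by maintaining a master in standby state. When the MySQL master fails, MHA can complete the database failover automatically within 0 to 30 seconds, and during the failover it preserves data consistency to the greatest extent possible, achieving high availability in a relative sense.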
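As the architecture figure shows, MHA is divided into the following parts.
Manager node:
- `masterha_check_ssh`: checks the SSH environment that MHA depends on;
- `masterha_check_repl`: checks the MySQL replication environment;
- `masterha_manager`: the main MHA service program;
- `masterha_check_status`: probes the running status of MHA;
- `masterha_master_monitor`: monitors the availability of the MySQL master node;
- `masterha_master_switch`: switches the master node;
- `masterha_conf_host`: adds or removes configured nodes;
- `masterha_stop`: stops the MHA service.
Node nodes (these tools are normally triggered by MHA Manager scripts and need no manual operation):
- `save_binary_logs`: saves and copies the master's binary logs;
- `apply_diff_relay_logs`: identifies differential relay log events and applies them to the other slaves;
- `purge_relay_logs`: purges relay logs (without blocking the SQL thread).
Custom extensions:
- `secondary_check_script`: checks master availability over multiple network routes;
- `master_ip_failover_script`: updates the master IP used by the application;
- `report_script`: sends reports;
- `init_conf_load_script`: loads initial configuration parameters;
- `master_ip_online_change_script`: updates the IP address of the master node.
MHA has specific requirements for the MySQL replication environment: every node must enable the binary log and the relay log, and every slave must explicitly enable `read_only` and disable `relay_log_purge`. These settings are described up front here.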
Hostname | IP | Role | Notes |
---|---|---|---|
manager | 172.30.200.100 | Manager (controller) | Management and failover |
master | 172.30.200.101 | Database master | binlog and relay log enabled; relay_log_purge disabled |
slave1 | 172.30.200.102 | Database slave | binlog and relay log enabled; relay_log_purge disabled |
slave2 | 172.30.200.103 | Database slave | binlog and relay log enabled; relay_log_purge disabled |
Add the following entries to /etc/hosts on every node:
    [root@localhost ~]# cat /etc/hosts
    127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
    172.30.200.100 arpmgr
    172.30.200.101 arpmaster
    172.30.200.102 arpslave1
    172.30.200.103 arpslave2
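Because the same four entries have to be added on every node, a small helper can make the step repeatable. The following is a minimal sketch rather than part of the original setup: `add_host_entry` is a hypothetical function name, and it assumes a POSIX shell with `awk` available.

```bash
# Hypothetical helper: append "IP NAME" to a hosts file unless NAME is
# already mapped there, so the step can be re-run safely.
add_host_entry() {
    hosts_file=$1
    ip=$2
    name=$3
    # Look for NAME as a whole field (column 2 onward) on a non-comment line.
    if awk -v n="$name" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
        END { exit !found }
    ' "$hosts_file"; then
        return 0
    fi
    printf '%s %s\n' "$ip" "$name" >> "$hosts_file"
}

# Usage on each node (as root):
#   add_host_entry /etc/hosts 172.30.200.100 arpmgr
#   add_host_entry /etc/hosts 172.30.200.101 arpmaster
#   add_host_entry /etc/hosts 172.30.200.102 arpslave1
#   add_host_entry /etc/hosts 172.30.200.103 arpslave2
```

Matching the name as a whole field keeps `arpmgr` and `arpmaster` from being confused with each other.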
Create the binlog directory:
    mkdir -p /data/mysqldata/binlog
    chown -R mysql:mysql /data/mysqldata/binlog
Configuration for node 101:
    server-id = 200101
    log-bin = /data/mysqldata/binlog/mysql-bin
    binlog_format = row
    max_binlog_size = 512m
    relay-log = /data/mysqldata/binlog/relay-bin
    expire-logs-days = 14
    lower_case_table_names = 1
    character-set-server = utf8
    log_slave_updates = 1
Configuration for node 102:
    server-id = 200102
    log-bin = /data/mysqldata/binlog/mysql-bin
    binlog_format = row
    max_binlog_size = 512m
    relay-log = /data/mysqldata/binlog/relay-bin
    expire-logs-days = 14
    read_only = ON
    relay_log_purge = 0
    lower_case_table_names = 1
    character-set-server = utf8
    log_slave_updates = 1
Configuration for node 103:
    server-id = 200103
    log-bin = /data/mysqldata/binlog/mysql-bin
    binlog_format = row
    max_binlog_size = 512m
    relay-log = /data/mysqldata/binlog/relay-bin
    read_only = ON
    relay_log_purge = 0
    expire-logs-days = 14
    lower_case_table_names = 1
    character-set-server = utf8
    log_slave_updates = 1
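The slave settings that MHA cares about (binary log, relay log, `read_only`, `relay_log_purge`) are easy to miss when editing three files by hand. The sketch below checks a my.cnf-style file for them. `cnf_value` and `check_slave_cnf` are hypothetical helper names, the `/etc/my.cnf` path in the usage comment is an assumption, and options written without a value (a bare `read_only`) are not recognized by this simple parser.

```bash
# Hypothetical helper: print the value of KEY from a my.cnf-style file
# (last occurrence wins). Dashes and underscores in option names are
# treated as equivalent.
cnf_value() {
    awk -F= -v key="$2" '
        BEGIN { gsub(/-/, "_", key) }
        /^[[:space:]]*(#|;|\[)/ { next }   # skip comments and section headers
        NF >= 2 {
            k = $1; v = $2
            gsub(/[[:space:]]/, "", k); gsub(/-/, "_", k)
            gsub(/^[[:space:]]+|[[:space:]]+$/, "", v)
            if (k == key) val = v
        }
        END { if (val != "") print val }
    ' "$1"
}

# Report every slave requirement that the file does not satisfy.
# Returns 0 when all four settings are in place, 1 otherwise.
check_slave_cnf() {
    rc=0
    [ -n "$(cnf_value "$1" log_bin)" ]   || { echo "log-bin is not set"; rc=1; }
    [ -n "$(cnf_value "$1" relay_log)" ] || { echo "relay-log is not set"; rc=1; }
    case "$(cnf_value "$1" read_only)" in
        1|ON|on) ;;
        *) echo "read_only is not enabled"; rc=1 ;;
    esac
    case "$(cnf_value "$1" relay_log_purge)" in
        0|OFF|off) ;;
        *) echo "relay_log_purge is not disabled"; rc=1 ;;
    esac
    return $rc
}

# Usage on a slave (config path is an assumption):
#   check_slave_cnf /etc/my.cnf
```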
On the master node:
    MariaDB [(none)]> grant replication slave, replication client on *.* to 'repl'@'172.30.200.%' identified by 'repl7101';
    MariaDB [(none)]> show master status;
    +------------------+----------+--------------+------------------+
    | File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
    +------------------+----------+--------------+------------------+
    | mysql-bin.000001 |      548 |              |                  |
    +------------------+----------+--------------+------------------+
    1 row in set (0.000 sec)
On each slave node:
    grant replication slave, replication client on *.* to 'repl'@'172.30.200.%' identified by 'repl7101';
    change master to master_host='172.30.200.101', master_user='repl', master_password='repl7101', master_log_file='mysql-bin.000001', master_log_pos=548;
    start slave;
    show slave status\G
At this point, the one-master, multi-slave replication setup is complete.
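Copying the binlog file name and position by hand from `show master status` into `change master to` is error-prone. As a rough sketch, the statement can be generated from the batch output of the client instead. `build_change_master` is a hypothetical helper, and the usage comment assumes the `mysql` client is on the PATH and can log in as `root` on the master.

```bash
# Hypothetical helper: read one "FILE<TAB>POSITION..." row on stdin
# (the batch output of `show master status`) and print the matching
# CHANGE MASTER TO statement.
# Arguments: $1 master host, $2 replication user, $3 replication password.
build_change_master() {
    read -r file pos _rest || [ -n "$file" ] || return 1
    printf "change master to master_host='%s', master_user='%s', master_password='%s', master_log_file='%s', master_log_pos=%s;\n" \
        "$1" "$2" "$3" "$file" "$pos"
}

# Usage (-N drops the column header, -B selects tab-separated output):
#   mysql -h 172.30.200.101 -u root -p -N -B -e 'show master status' \
#       | build_change_master 172.30.200.101 repl repl7101
```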
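MHA needs an account with administrative privileges. It can be created on every node, but for now it is enough to grant it on the master node:
    grant all on *.* to 'mhaadmin'@'172.30.%.%' identified by 'mha7101';
    grant all on *.* to 'mhaadmin'@'arpmgr' identified by 'mha7101';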
Run the following commands on all four nodes:
    ssh-keygen -t rsa
    ssh-copy-id -i .ssh/id_rsa.pub root@arpmgr
Then, on the `arpmgr` node, the `authorized_keys` file contains the following:
    [root@localhost .ssh]# cat authorized_keys
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDY3yFhR5uzcFEpd+q+1Uw/cRF9ZRraygms7OZoefLFzY/ydSi6yYuCighG8WquvRep7XDNjFI71HAUagSoXiyPoCe1lqEnzpxSc+fQpIeQqEhUmLJ2bk+R83EskzwRGh+S/D4yp/swWz1vRgUGoTWevLCs33q7ZrsM8i+jB0uwZmzOV+CyQAPW9vLkRjZa4y1sx65lbR0HbdTQWQYZ4IyZauoU8XQjAIOs/CdLw2nBt8dPO53jT7NS7Ywx6eu/Wj9k/sYVVZT3jTb+pBIVs+Du5+tdUDX5aLKzxINpLlqNhorNevoC9iE0Ame1qvYonQfyWQ52Ae0y+58vFfG6PyV3 root@localhost.localdomain
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/ZPihYSC6ArawKRU75aQRVSFsQ5S89SrYHGWdzyluB4spj+UDUmWH1kLGYr715/HD5hh22KdLmIs7R4jviOeao1HK52fpMvklYaNtYRHuV63Zkg5sOLvLfhrHdta9wuHlW1NyWx75+wIl2LvKBRtnSddwf5ZvitJ/kChf2gpNhHAWidyjGsPoJdr0OBCNHvz1y6oON6cnMb07ExaIjptRnkbCOU0QSVjFq4+Jmh8zTTbJC2up50s15gSfWXH0+WLXmJXJGkvgHdSYqw4vJt/l25f5qAKKZsfnyfC0iyct4GyHPF6trpvQ/c2lqr/Rg4xLWgdxlyt4aBJYl5adIRK/ root@localhost.localdomain
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDba26wV0KwQNTb4pKuiFDCcVMNRLGMXSiJC8ucN4/KIqzoOYJ747QL8GL5F8ePnRaZ1rtOwdjnlTiC0a4Tcg4JLs+JSnJgzvepuixmGgSJfLbJ36iN1WFh6fP2GZEDdR7Qum4sBUpQyYJ20Kf9rKfQQv2wq6csK5IlFk/OoO+zTySauLnYvRxvKY2avVDXPPFJvpqimKXn59MIAoJr6YEKvncbYyqvrSgUy7klZDys9IIjYcWfO7VKjQ5bwbHrrKtNbedME+KPQld7e8ZVL66Omik4Z6ip7DQEHRKWMmuBIpL99AgOOjPLbzJFWLUPOwvy3DtmEBnZ+0NVf/1obC11 root@localhost.localdomain
    ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCwrgZtGC31EgixeY4SVl4h64m1r8LdL3hM4Be2/I+6Xw7hCzZyKKTAFgz9W/ukfx6WmZwoqp1VO/7Jp6KO1FhYOi5u0q6J1KIObFNp+3E6cB2P0q39WqmZpQ9cNPYrbs9U2Ej0L0JwUtf/xLh334PaSlv/LcNy+p1dWya2OqsBeraiXZ4MgEBzcb+0twkpfpD327VgT/mRHPmA6fPRJOOJti1u4isHeotE4i13YIqQYfBfmbfiLdXKAvgI8FuTf0i91Re/FUBOgBfBcJbqIQNR0Nh5wZ/LvNxkstDQvypZIZwiK+wN+aZZOQ7jF/+997Z9QQleC9OOoHOJR7+fisLb root@localhost.localdomain
There are exactly four `ssh-rsa` key entries, one from each node.
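Counting the entries by eye works for four nodes, but a quick check is less fragile. The following sketch counts the distinct `ssh-rsa` keys in an `authorized_keys` file and compares the result with the expected number of nodes. `check_key_count` is a hypothetical helper name, not something shipped with MHA or OpenSSH.

```bash
# Hypothetical helper: verify that an authorized_keys file holds the
# expected number of distinct ssh-rsa public keys.
# Arguments: $1 path to authorized_keys, $2 expected key count.
check_key_count() {
    keys_file=$1
    expected=$2
    # Field 2 is the base64 key body; sort -u removes duplicate entries.
    actual=$(grep '^ssh-rsa ' "$keys_file" | awk '{ print $2 }' | sort -u | wc -l | tr -d ' ')
    if [ "$actual" -eq "$expected" ]; then
        echo "ok: $actual distinct keys"
    else
        echo "expected $expected distinct keys, found $actual" >&2
        return 1
    fi
}

# Usage on arpmgr:
#   check_key_count ~/.ssh/authorized_keys 4
```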
Copy the public keys above to the other three servers:
    scp authorized_keys root@arpmaster:~/.ssh/
    scp authorized_keys root@arpslave1:~/.ssh/
    scp authorized_keys root@arpslave2:~/.ssh/
Test that SSH works:
    [root@localhost .ssh]# ssh arpmaster
    [root@localhost ~]# ssh arpslave1
    [root@localhost ~]# ssh arpslave2
    [root@localhost ~]# ssh arpmgr
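Once the interactive logins above have succeeded (and the host keys have been accepted), the same check can be repeated non-interactively on each node. This is only a sketch: `check_ssh_hosts` is a hypothetical helper, and it assumes logins as `root`, as in the rest of this walkthrough.

```bash
# Hypothetical helper: try a non-interactive login to each host and
# report the result. BatchMode=yes makes ssh fail instead of prompting
# for a password, so a missing key shows up as FAILED.
check_ssh_hosts() {
    failed=0
    for host in "$@"; do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "root@$host" true; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return $failed
}

# Usage on each of the four nodes:
#   check_ssh_hosts arpmgr arpmaster arpslave1 arpslave2
```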
MHA ships as two packages: `node` and `manager`.
Install on all four nodes: mha4mysql-node-0.57-0.el7.centos.noarch.rpm
Install on the manager node: mha4mysql-manager-0.57-0.el7.centos.noarch.rpm
Installing `mha4mysql-node-0.57-0.el7.centos.noarch.rpm` requires `perl-DBD-mysql` and `perl-DBI` as prerequisites. Install them as follows:
    yum install perl-DBD-mysql perl-DBI
Installing `mha4mysql-manager-0.57-0.el7.centos.noarch.rpm` requires several Perl modules as prerequisites. Install them as follows:
    # Install the EPEL repository package first
    yum install https://dl.fedoraproject.org/pub/epel/epel-release-latest-7.noarch.rpm
    yum install perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker perl-Config-Tiny perl-Log-Dispatch-* perl-Parallel-ForkManager
Both packages then install successfully, as shown below:
    [root@localhost ~]# rpm -ivh mha4mysql-node-0.57-0.el7.centos.noarch.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:mha4mysql-node-0.58-0.el7.centos ################################# [100%]
    [root@localhost ~]# rpm -ivh mha4mysql-manager-0.57-0.el7.centos.noarch.rpm
    Preparing...                          ################################# [100%]
    Updating / installing...
       1:mha4mysql-manager-0.58-0.el7.cent################################# [100%]
Version 0.58 relies on `super_read_only`, which is not available in MariaDB, so version 0.57 is used here.
Create the MHA configuration file on the manager node:
    [root@localhost ~]# cd /etc/mha_master/
    [root@localhost mha_master]# vi /etc/mha_master/mha.cnf
The configuration file contains the following:
    [server default]
    user=mhaadmin
    password=mha7101
    manager_workdir=/etc/mha_master/app1
    manager_log=/etc/mha_master/manager.log
    remote_workdir=/data/mha_master/app1
    repl_user=repl
    repl_password=repl7101
    ping_interval=1
    [server1]
    hostname=172.30.200.101
    ssh_port=22
    [server2]
    hostname=172.30.200.102
    ssh_port=22
    candidate_master=1
    [server3]
    hostname=172.30.200.103
    ssh_port=22
    no_master=1
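Before running the MHA checks, it can help to confirm which hosts the configuration actually declares. The sketch below lists the `hostname` of every `[serverN]` section in an MHA configuration file. `list_mha_servers` is a hypothetical helper written for this walkthrough, not an MHA tool.

```bash
# Hypothetical helper: print "section hostname" for every [serverN]
# section of an MHA configuration file. The [server default] section is
# skipped because its name does not match serverN.
list_mha_servers() {
    awk -F= '
        /^\[/ { section = $0; gsub(/\[|\]/, "", section); next }
        section ~ /^server[0-9]+$/ && $1 ~ /^[[:space:]]*hostname[[:space:]]*$/ {
            host = $2
            gsub(/[[:space:]]/, "", host)
            print section, host
        }
    ' "$1"
}

# Usage on the manager node:
#   list_mha_servers /etc/mha_master/mha.cnf
```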
Check SSH connectivity from the manager node:
    [root@localhost ~]# masterha_check_ssh --conf=/etc/mha_master/mha.cnf
Log output like the following indicates that everything is fine:
    Thu Jan 9 14:43:09 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
    Thu Jan 9 14:43:09 2020 - [info] Reading application default configuration from /etc/mha_master/mha.cnf..
    root@172.30.200.103(172.30.200.103:22) to root@172.30.200.101(172.30.200.101:22)..
    Thu Jan 9 14:43:11 2020 - [debug] ok.
    Thu Jan 9 14:43:11 2020 - [debug] Connecting via SSH from root@172.30.200.103(172.30.200.103:22) to root@172.30.200.102(172.30.200.102:22)..
    Thu Jan 9 14:43:11 2020 - [debug] ok.
    Thu Jan 9 14:43:12 2020 - [info] All SSH connection tests passed successfully.
Check that MySQL replication is healthy:
    masterha_check_repl --conf=/etc/mha_master/mha.cnf
Log output like the following indicates that everything is fine:
    Thu Jan 9 14:44:54 2020 - [info] Slaves settings check done.
    Thu Jan 9 14:44:54 2020 - [info]
    172.30.200.101(172.30.200.101:3306) (current master)
     +--172.30.200.102(172.30.200.102:3306)
     +--172.30.200.103(172.30.200.103:3306)
    Thu Jan 9 14:44:54 2020 - [info] Checking replication health on 172.30.200.102..
    Thu Jan 9 14:44:54 2020 - [info] ok.
    Thu Jan 9 14:44:54 2020 - [info] Checking replication health on 172.30.200.103..
    Thu Jan 9 14:44:54 2020 - [info] ok.
    Thu Jan 9 14:44:54 2020 - [warning] master_ip_failover_script is not defined.
    Thu Jan 9 14:44:54 2020 - [warning] shutdown_script is not defined.
    Thu Jan 9 14:44:54 2020 - [info] Got exit code 0 (Not master dead).
    MySQL Replication Health is OK.
Start MHA Manager:
    nohup masterha_manager --conf=/etc/mha_master/mha.cnf &> /etc/mha_master/manager.log &
Check the status of the master node:
    [root@localhost ~]# masterha_check_status --conf=/etc/mha_master/mha.cnf
    mha (pid:31709) is running(0:PING_OK), master:172.30.200.101
This shows that the master database `172.30.200.101` is up and running normally.
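For scripted monitoring, the status line can be parsed rather than read by eye. The sketch below extracts the current master from a line in the format shown above. `mha_current_master` is a hypothetical helper, and it only recognizes the `is running(0:PING_OK)` form that appears in this walkthrough.

```bash
# Hypothetical helper: read masterha_check_status output on stdin and
# print the current master when the manager reports PING_OK.
# Prints nothing for any other status line.
mha_current_master() {
    sed -n 's/.*is running(0:PING_OK), master:\([^[:space:]]*\).*/\1/p'
}

# Usage on the manager node:
#   masterha_check_status --conf=/etc/mha_master/mha.cnf | mha_current_master
```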
Stop MHA Manager:
    masterha_stop --conf=/etc/mha_master/mha.cnf
On the master, kill the MySQL processes directly to simulate a failure:
    [root@localhost ~]# ps -ef |grep mysql
    root 19864 1 0 08:51 ? 00:00:00 /bin/sh /usr/local/mysql/bin/mysqld_safe --datadir=/data/mysqldata --pid-file=/data/mysqldata/localhost.localdomain.pid
    mysql 19976 19864 0 08:51 ? 00:00:13 /usr/local/mysql/bin/mysqld --basedir=/usr/local/mysql --datadir=/data/mysqldata --plugin-dir=/usr/local/mysql/lib/plugin --user=mysql --log-error=/data/mysqldata/mysqld.log --pid-file=/data/mysqldata/localhost.localdomain.pid --socket=/tmp/mysql.sock
    root 22166 21525 0 14:55 pts/0 00:00:00 grep --color=auto mysql
    [root@localhost ~]# kill -9 19864 19976
The MHA failover log:
    [root@localhost ~]# tail -f /etc/mha_master/manager.log
    From:
    172.30.200.101(172.30.200.101:3306) (current master)
     +--172.30.200.102(172.30.200.102:3306)
     +--172.30.200.103(172.30.200.103:3306)
    To:
    172.30.200.102(172.30.200.102:3306) (new master)
     +--172.30.200.103(172.30.200.103:3306)
    Master 172.30.200.101(172.30.200.101:3306) is down!
    Check MHA Manager logs at localhost.localdomain:/etc/mha_master/manager.log for details.
    Started automated(non-interactive) failover.
    The latest slave 172.30.200.102(172.30.200.102:3306) has all relay logs for recovery.
    Selected 172.30.200.102(172.30.200.102:3306) as a new master.
    172.30.200.102(172.30.200.102:3306): OK: Applying all logs succeeded.
    172.30.200.103(172.30.200.103:3306): This host has the latest relay log events.
    Generating relay diff files from the latest slave succeeded.
    172.30.200.103(172.30.200.103:3306): OK: Applying all logs succeeded. Slave started, replicating from 172.30.200.102(172.30.200.102:3306)
    172.30.200.102(172.30.200.102:3306): Resetting slave info succeeded.
    Master failover to 172.30.200.102(172.30.200.102:3306) completed successfully.
The log shows that `172.30.200.102` has become the new master, while `172.30.200.103` remains a slave.
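The outcome of a failover can also be pulled out of the manager log automatically, for example to feed a notification. This sketch looks for the final `Master failover to ... completed successfully.` line. `last_failover_target` is a hypothetical helper, and the pattern is based only on the log excerpt shown in this walkthrough.

```bash
# Hypothetical helper: print the host promoted by the most recent
# successful failover recorded in an MHA manager log, or nothing if the
# log contains no completed failover.
last_failover_target() {
    sed -n 's/.*Master failover to \([^(]*\)(.*) completed successfully\..*/\1/p' "$1" | tail -n 1
}

# Usage on the manager node:
#   last_failover_target /etc/mha_master/manager.log
```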
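Because this is a lab environment, exporting a mysqldump backup can be skipped. When recovering in production, stop the SQL thread on a slave, note the corresponding binlog position, take a backup, and once the data is consistent, resynchronize and restore the failed node. The failed node then rejoins as a slave of the new master:
    change master to master_host='172.30.200.102', master_user='repl', master_password='repl7101', master_log_file='mysql-bin.000003', master_log_pos=401;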
Check the slave status:
    MariaDB [(none)]> start slave;
    MariaDB [(none)]> show slave status\G
    *************************** 1. row ***************************
    Slave_IO_State: Waiting for master to send event
    Master_Host: 172.30.200.102
    Master_User: repl
    Master_Port: 3306
    Connect_Retry: 60
    Master_Log_File: mysql-bin.000003
    Read_Master_Log_Pos: 401
    Relay_Log_File: relay-bin.000002
    Relay_Log_Pos: 555
    Relay_Master_Log_File: mysql-bin.000003
    Slave_IO_Running: Yes
    Slave_SQL_Running: Yes
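The two fields that matter in this output are `Slave_IO_Running` and `Slave_SQL_Running`; both must be `Yes`. As a minimal sketch, the check can be scripted by piping the `\G` output through a small filter. `replication_ok` is a hypothetical helper, and the usage comment assumes the `mysql` client can connect without further options.

```bash
# Hypothetical helper: read `show slave status\G` output on stdin and
# succeed only when both replication threads report "Yes".
replication_ok() {
    awk -F': *' '
        {
            name = $1; value = $2
            gsub(/^[[:space:]]+/, "", name)    # \G output is right-aligned
            gsub(/[[:space:]]+$/, "", value)
        }
        name == "Slave_IO_Running"  { io = value }
        name == "Slave_SQL_Running" { sql = value }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on a slave:
#   mysql -e 'show slave status\G' | replication_ok && echo "replication healthy"
```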
Start MHA Manager again:
    [root@localhost ~]# nohup masterha_manager --conf=/etc/mha_master/mha.cnf &> /etc/mha_master/manager.log &
    [root@localhost ~]# masterha_check_status --conf=/etc/mha_master/mha.cnf
    mha (pid:31905) is running(0:PING_OK), master:172.30.200.101
This completes the MHA experiment. Production environments also use a VIP, which will be covered in a follow-up post.
Log error:
    Thu Jan 9 11:31:36 2020 - [info] Connecting to root@172.30.200.102(172.30.200.102:22)..
    Can't exec "mysqlbinlog": No such file or directory at /usr/share/perl5/vendor_perl/MHA/BinlogManager.pm line 106.
    mysqlbinlog version command failed with rc 1:0, please verify PATH, LD_LIBRARY_PATH, and client options
Solution:
    ln -s /usr/local/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
Log error:
    Checking if super_read_only is defined and turned on..DBI connect(';host=172.30.200.102;port=3306','mhaadmin',...) failed: Access denied for user 'mhaadmin'@'arpslave1' (using password: YES) at /usr/share/perl5/vendor_perl/MHA/SlaveUtil.pm line 239
Solution:
Grant the `mhaadmin` account access from every node's hostname by running the following in MySQL on the master:
    grant all on *.* to 'mhaadmin'@'arpmgr' identified by 'mha7101';
    grant all on *.* to 'mhaadmin'@'arpmaster' identified by 'mha7101';
    grant all on *.* to 'mhaadmin'@'arpslave1' identified by 'mha7101';
    grant all on *.* to 'mhaadmin'@'arpslave2' identified by 'mha7101';
Log error:
    Testing mysql connection and privileges..sh: mysql: command not found
    mysql command failed with rc 127:0!
Solution:
    ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql
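Both PATH-related errors above can be caught before MHA is started by checking that the client binaries resolve on each node. The following is a small sketch: `missing_commands` is a hypothetical helper, and the list of binaries is taken from the two errors in this section.

```bash
# Hypothetical helper: print every command name from the argument list
# that cannot be found on the current PATH.
missing_commands() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage on each database node; no output means both binaries are found:
#   missing_commands mysql mysqlbinlog
```

MHA runs these binaries over non-interactive SSH sessions, where the PATH is often shorter than in a login shell, which is why the symlinks are placed in standard directories.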