MySQL Cluster Setup (3): MMM High-Availability Architecture

In the previous article, MySQL Cluster Setup (2): Master-Master-Slave Mode, we covered how to build a MySQL master-master-slave topology. This article moves on to a proper MySQL high-availability architecture.

1 Introduction to MMM

1.1 Overview

MMM (Multi-Master Replication Manager for MySQL) is a third-party toolkit for failover and day-to-day management of a dual-master setup. It is written in Perl and is used to manage and monitor master-master replication; although the topology contains two masters, only one node is allowed to accept writes at any given time.

MMM defines two roles: writer and reader, corresponding to the read-write node and the read-only nodes.

With MMM managing the dual masters, when the writer node goes down (say, Master1), MMM automatically removes the read and write VIPs from that node, switches over to Master2, sets read_only = 0 on Master2, and repoints all slave nodes to Master2.

Besides the dual masters, MMM also manages the slave nodes: if a slave crashes, falls behind in replication, or hits a replication error, MMM removes its VIP until the node is healthy again.

1.2 Components

MMM consists of two kinds of programs:

  • monitor: monitors the state of every database in the cluster and issues switchover commands when something goes wrong; it is normally deployed separately from the databases
  • agent: an agent process running on every MySQL server; it carries out the monitor's commands, does the local probing, and applies the concrete changes such as configuring VIPs and repointing replication to the new master

The architecture looks like this:

(figure: MySQL MMM architecture)

1.3 Failover Flow

Using the architecture above as an example, the failover flow is as follows; assume Master1 goes down:

  1. The monitor detects that it can no longer connect to Master1
  2. The monitor sends a set_offline command to Master1's agent
  3. If Master1's agent is still alive, it takes down the writer VIP and tries to set read_only=1 on Master1
  4. The monitor sends a set_online command to Master2
  5. Master2's agent receives the command and runs select master_pos_wait() to wait until replication has caught up
  6. Master2's agent brings up the writer VIP and sets read_only=0 on Master2
  7. The monitor sends commands to each slave's agent to change their replication source
  8. Each slave starts replicating from the new master
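
Step 5 is worth a closer look: MASTER_POS_WAIT() blocks until the local SQL thread has applied the relay log up to the given master binlog coordinates, or until the timeout expires. Below is a minimal, hedged sketch of what that step amounts to, using the same client invocation style as later in this article; the binlog file name and position are placeholders, not values from this setup.

# Hedged illustration of step 5: before taking writes, Master2 waits until it
# has applied everything it already received from the old master, then opens
# itself for writes. The coordinates below are placeholders.
/usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock \
    -e "SELECT MASTER_POS_WAIT('mysql-bin.000012', 4039, 30); SET GLOBAL read_only = 0;"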

From this flow we can see that if the primary node fails, MMM performs the switchover automatically, with no manual intervention. We can also see a problem: when a database dies, MMM only switches roles and does not backfill any data that was lost, so MMM carries a risk of data inconsistency.

2 Installing MMM

2.1 Installing with yum

If the servers have internet access or a suitable yum repository, simply run the following commands:

# Add the EPEL repository (skip this step if your default repositories already provide the packages)
yum install epel-release.noarch 
# Run on the agent nodes
yum install -y mysql-mmm-agent
# Run on the monitor node
yum install -y mysql-mmm-monitor

Running these commands installs the following packages and dependencies:

mysql-mmm-agent.noarch 0:2.2.1-1.el5
libart_lgpl.x86_64 0:2.3.17-4                                                 
mysql-mmm.noarch 0:2.2.1-1.el5                                                
perl-Algorithm-Diff.noarch 0:1.1902-2.el5                                     
perl-DBD-mysql.x86_64 0:4.008-1.rf                                            
perl-DateManip.noarch 0:5.44-1.2.1                                            
perl-IPC-Shareable.noarch 0:0.60-3.el5                                        
perl-Log-Dispatch.noarch 0:2.20-1.el5                                         
perl-Log-Dispatch-FileRotate.noarch 0:1.16-1.el5                              
perl-Log-Log4perl.noarch 0:1.13-2.el5                                         
perl-MIME-Lite.noarch 0:3.01-5.el5                                            
perl-Mail-Sender.noarch 0:0.8.13-2.el5.1                                      
perl-Mail-Sendmail.noarch 0:0.79-9.el5.1                                      
perl-MailTools.noarch 0:1.77-1.el5                                            
perl-Net-ARP.x86_64 0:1.0.6-2.1.el5                                           
perl-Params-Validate.x86_64 0:0.88-3.el5                                      
perl-Proc-Daemon.noarch 0:0.03-1.el5                                          
perl-TimeDate.noarch 1:1.16-5.el5                                             
perl-XML-DOM.noarch 0:1.44-2.el5                                              
perl-XML-Parser.x86_64 0:2.34-6.1.2.2.1                                       
perl-XML-RegExp.noarch 0:0.03-2.el5                                           
rrdtool.x86_64 0:1.2.27-3.el5                                                 
rrdtool-perl.x86_64 0:1.2.27-3.el5

For other systems, refer to the official site for installation instructions.
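
Once the yum install completes, it can be quickly verified with rpm, for example:

# Confirm the MMM packages are present and see where their files went
rpm -qa | grep mysql-mmm
rpm -ql mysql-mmm-agent          # lists the files installed by the agent package
ls /etc/mysql-mmm/               # default configuration directory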

2.2 Manual Installation

1). Download the package

Go to the MMM download page, Downloads MMM for MySQL, and click the download link, as shown below:

(figure: MMM download page)

Once the download finishes, upload the package to the servers.
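
For example, with scp (the target host and path are placeholders):

# Hypothetical example: copy the downloaded tarball to one of the database nodes
scp mysql-mmm-2.2.1.tar.gz root@10.0.0.247:/root/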

2). Install the dependencies

# Install build tools, then bootstrap cpanm (the cpanminus Perl module installer)
yum install -y wget perl openssl gcc gcc-c++
wget http://xrl.us/cpanm --no-check-certificate
mv cpanm /usr/bin
chmod 755 /usr/bin/cpanm
# Write the list of required Perl modules to /root/list
cat > /root/list << EOF
install Algorithm::Diff
install Class::Singleton
install DBI
install DBD::mysql
install File::Basename
install File::stat
install File::Temp
install Log::Dispatch
install Log::Log4perl
install Mail::Send
install Net::ARP
install Net::Ping
install Proc::Daemon
install Thread::Queue
install Time::HiRes
EOF
 
for package in `cat /root/list`
do
    cpanm $package
done

3). Install

tar -xvf mysql-mmm-2.2.1.tar.gz
cd mysql-mmm-2.2.1
make install
PS: In many environments the database hosts are not allowed to reach the internet; in that case the RPM packages for the dependencies above have to be downloaded one by one and copied onto the servers.
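
If the hosts have no internet access at all, one hedged way to stage the packages is to download them on a machine that does (yum-utils provides yumdownloader; the paths below are placeholders) and copy them over:

# On a machine with internet access and the same OS/architecture:
yum install -y yum-utils
yumdownloader --resolve --destdir=/tmp/mmm-rpms mysql-mmm-agent mysql-mmm-monitor
# Copy /tmp/mmm-rpms to the offline servers, then on each server:
yum localinstall -y /tmp/mmm-rpms/*.rpm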

3 Preparing the Database Environment

A master-master-slave replication setup was prepared beforehand; see the earlier articles for how to build it. The details are as follows.

Node information

IP          OS         Port   MySQL version  Node     Read/write                       Description
10.0.0.247  CentOS6.5  3306   5.7.9          Master   read-write                       primary master
10.0.0.248  CentOS6.5  3306   5.7.9          Standby  read-only, can become writer     standby master
10.0.0.249  CentOS6.5  3306   5.7.9          Slave    read-only                        slave node
10.0.0.24   CentOS6.5  -      -              monitor  -                                MMM monitor

VIP information

Name     VIP         Type
RW-VIP   10.0.0.237  read-write VIP
RO-VIP1  10.0.0.238  read VIP
RO-VIP2  10.0.0.239  read VIP

Architecture diagram

(figure: master-master-slave architecture)

Reference configuration

Master1

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_mmm/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_mmm
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_mmm/mysql.sock
pid-file = /data/mysql_db/test_mmm/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2473306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 1
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_mmm/mysql-bin
log_bin_index = /data/mysql_log/test_mmm/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_mmm/mysql-relay-bin
relay_log_index=/data/mysql_log/test_mmm/mysql-relay-bin.index
log_error = /data/mysql_log/test_mmm/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Master2

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_mmm/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_mmm
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_mmm/mysql.sock
pid-file = /data/mysql_db/test_mmm/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2483306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

auto_increment_offset = 2
auto_increment_increment = 2

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_mmm/mysql-bin
log_bin_index = /data/mysql_log/test_mmm/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_mmm/mysql-relay-bin
relay_log_index=/data/mysql_log/test_mmm/mysql-relay-bin.index
log_error = /data/mysql_log/test_mmm/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Slave

[client]
port = 3306
default-character-set=utf8mb4
socket = /data/mysql_db/test_mmm/mysql.sock

[mysqld]
datadir = /data/mysql_db/test_mmm
basedir = /usr/local/mysql57
tmpdir = /tmp
socket = /data/mysql_db/test_mmm/mysql.sock
pid-file = /data/mysql_db/test_mmm/mysql.pid
skip-external-locking = 1
skip-name-resolve = 1
port = 3306
server_id = 2493306

default-storage-engine = InnoDB
character-set-server = utf8mb4
default_password_lifetime=0

read_only=1

#### log ####
log_timestamps=system
log_bin = /data/mysql_log/test_mmm/mysql-bin
log_bin_index = /data/mysql_log/test_mmm/mysql-bin.index
binlog_format = row
relay_log_recovery=ON
relay_log=/data/mysql_log/test_mmm/mysql-relay-bin
relay_log_index=/data/mysql_log/test_mmm/mysql-relay-bin.index
log_error = /data/mysql_log/test_mmm/mysql-error.log

#### replication ####
log_slave_updates = 1
replicate_wild_ignore_table = information_schema.%,performance_schema.%,sys.%

#### semi sync replication settings #####
plugin_dir=/usr/local/mysql57/lib/plugin
plugin_load = "rpl_semi_sync_master=semisync_master.so;rpl_semi_sync_slave=semisync_slave.so"
loose_rpl_semi_sync_master_enabled = 1
loose_rpl_semi_sync_slave_enabled = 1

Creating users

Run the following statements on the primary master to create the MMM accounts. Since this is a test environment, each password is simply set to the same value as the user name.

CREATE USER 'mmm_monitor'@'%'        IDENTIFIED BY 'mmm_monitor';
CREATE USER 'mmm_agent'@'%'          IDENTIFIED BY 'mmm_agent';
GRANT REPLICATION CLIENT                   ON *.* TO 'mmm_monitor'@'%';
GRANT SUPER, REPLICATION CLIENT, PROCESS   ON *.* TO 'mmm_agent'@'%';
FLUSH PRIVILEGES;
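
The mmm_common configuration in the next section also references a replication account (repl/repl). That account should already exist from the master-master-slave setup in the previous article; if it does not, here is a hedged sketch of creating it in the same style, using the socket path from the configs above:

# Only needed if the replication account was not created when replication was set up
/usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "
  CREATE USER 'repl'@'%' IDENTIFIED BY 'repl';
  GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';
  FLUSH PRIVILEGES;"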

4 Configuring MMM

4.1 Configuration Files

MMM has three configuration files, mmm_agent.conf, mmm_common.conf and mmm_mon.conf, located in /etc/mysql-mmm. If you need to separate clusters, i.e. run several MMM instances on one server, the files can be named mmm_agent_cluster.conf, mmm_common_cluster.conf and mmm_mon_cluster.conf, where cluster is the cluster name.

  • mmm_common.conf: common settings, required on every MMM node
  • mmm_agent.conf: agent settings, required on every MMM agent node
  • mmm_mon.conf: monitor settings, required on the MMM monitor node

For this setup we name the cluster test_mmm; the concrete configuration follows.

mmm_common

On every node, create /etc/mysql-mmm/mmm_common_test_mmm.conf and fill it in according to your environment:

active_master_role  writer


<host default>
    cluster_interface       eth0                                        # network interface used by the cluster

    agent_port              9989                                        # agent listening port; change the default if you run several agents
    mysql_port              3306                                        # MySQL port, 3306 by default

    pid_path                /var/run/mysql-mmm/mmm_agentd_test_mmm.pid  # pid file path, must match the init script
    bin_path               /usr/libexec/mysql-mmm                       # path to the MMM binaries

    replication_user        repl                                        # replication user
    replication_password    repl                                        # replication user password

    agent_user              mmm_agent                                   # agent user, used to set `read_only` etc.
    agent_password          mmm_agent                                   # agent user password
</host>

<host cluster01>                            # host name for master1
    ip              10.0.0.247              # master1 IP
    mode            master                  # role: master means this is a master node
    peer            cluster02               # host name of master1's peer, i.e. the other master
</host>

<host cluster02>                            # host name for master2
    ip              10.0.0.248              # master2 IP
    mode            master                  # role: master means this is a master node
    peer            cluster01               # host name of master2's peer, i.e. the other master
</host>

<host cluster03>                            # host name for the slave
    ip              10.0.0.249              # slave IP
    mode            slave                   # role: slave means this is a slave node
</host>


<role writer>                               # writer role configuration
    hosts           cluster01, cluster02    # host names allowed to take the writer role
    ips             10.0.0.237              # writer VIP
    mode            exclusive               # exclusive means only one writer (write VIP) exists at a time
</role>

<role reader>                                          # reader role configuration
    hosts           cluster01, cluster02, cluster03    # host names allowed to take the reader role
    ips             10.0.0.238,10.0.0.239              # reader VIPs
    mode            balanced                           # balanced means the role is load-balanced across several hosts at once
</role>

mmm_agent

On every agent node, create /etc/mysql-mmm/mmm_agent_test_mmm.conf with the following content:

  • Cluster1
include mmm_common_test_mmm.conf  # name of the common file created above
this cluster01  # name of the current node, matching a host name in the common file
  • Cluster2
include mmm_common_test_mmm.conf
this cluster02
  • Cluster3
include mmm_common_test_mmm.conf
this cluster03

mmm_mon

On the monitor node, create /etc/mysql-mmm/mmm_mon_test_mmm.conf with the monitor configuration:

include mmm_common_test_mmm.conf                                    # name of the common file

<monitor>
    ip               127.0.0.1                                   # listen IP
    port             9992                                        # listen port
    pid_path         /var/run/mysql-mmm/mmm_mond_test_mmm.pid    # pid file path, must match the init script
    bin_path         /usr/libexec/mysql-mmm                      # path to the MMM binaries
    status_path      /var/lib/mysql-mmm/mmm_mond_test_mmm.status # status file path
    ping_ips         10.0.0.247, 10.0.0.248, 10.0.0.249          # IPs to monitor, i.e. the MySQL node IPs
    auto_set_online  30                                          # seconds before a recovered node is automatically set online
</monitor>

<host default>
    monitor_user      mmm_monitor             # MySQL account used for monitoring
    monitor_password  mmm_monitor             # password of the monitoring account
</host>

<check mysql>
    check_period      2       # check interval in seconds
    trap_period       4       # a node is considered unreachable after failing checks for trap_period seconds
    max_backlog       900     # a node whose replication lag exceeds this value is set offline
</check>

debug 0                         # whether to enable debug mode

PS1: The comments in the configuration files above must be removed before the files are used.
PS2: If there is only one cluster, you can simply edit the default configuration files instead.
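
One hedged way to strip the trailing annotations in bulk is shown below; review the result before relying on it, since it removes everything from a '#' to the end of each line and keeps a .bak backup.

# Remove the explanatory "# ..." annotations before using the files
sed -i.bak 's/[[:space:]]*#.*$//' /etc/mysql-mmm/mmm_common_test_mmm.conf \
    /etc/mysql-mmm/mmm_agent_test_mmm.conf /etc/mysql-mmm/mmm_mon_test_mmm.conf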

4.2 Init Scripts

After installation, init scripts are generated under /etc/init.d/:

[root@chengqm ~]# ls /etc/init.d/mysql*
/etc/init.d/mysqld  /etc/init.d/mysql-mmm-agent  /etc/init.d/mysql-mmm-monitor

mysql-mmm-agent

Run the following on every agent node:

cp /etc/init.d/mysql-mmm-agent /etc/init.d/mysql-mmm-agent-test-mmm

Open /etc/init.d/mysql-mmm-agent-test-mmm. If the top of the script looks like this:

CLUSTER=''


#-----------------------------------------------------------------------
# Paths
if [ "$CLUSTER" != "" ]; then
    MMM_AGENTD_BIN="/usr/sbin/mmm_agentd @$CLUSTER"
    MMM_AGENTD_PIDFILE="/var/run/mmm_agentd-$CLUSTER.pid"
else
    MMM_AGENTD_BIN="/usr/sbin/mmm_agentd"
    MMM_AGENTD_PIDFILE="/var/run/mmm_agentd.pid"
fi

echo "Daemon bin: '$MMM_AGENTD_BIN'"
echo "Daemon pid: '$MMM_AGENTD_PIDFILE'"

change it to:

CLUSTER='test_mmm'


#-----------------------------------------------------------------------
# Paths
if [ "$CLUSTER" != "" ]; then
    MMM_AGENTD_BIN="/usr/sbin/mmm_agentd @$CLUSTER"
    MMM_AGENTD_PIDFILE="/var/run/mysql-mmm/mmm_agentd_$CLUSTER.pid"
else
    MMM_AGENTD_BIN="/usr/sbin/mmm_agentd"
    MMM_AGENTD_PIDFILE="/var/run/mysql-mmm/mmm_agentd.pid"
fi

echo "Daemon bin: '$MMM_AGENTD_BIN'"
echo "Daemon pid: '$MMM_AGENTD_PIDFILE'"

If instead the file looks like this:

MMMD_AGENT_BIN="/usr/sbin/mmm_agentd"
MMMD_AGENT_PIDFILE="/var/run/mysql-mmm/mmm_agentd.pid"
LOCKFILE='/var/lock/subsys/mysql-mmm-agent'
prog='MMM Agent Daemon'

change it to:

...
CLUSTER='test_mmm'
MMMD_AGENT_BIN="/usr/sbin/mmm_agentd @$CLUSTER"
MMMD_AGENT_PIDFILE="/var/run/mysql-mmm/mmm_agentd_$CLUSTER.pid"
LOCKFILE="/var/lock/subsys/mysql-mmm-agent-$CLUSTER"
prog='MMM Agent Daemon'

mysql-mmm-monitor

Run the following on the monitor node:

cp /etc/init.d/mysql-mmm-monitor /etc/init.d/mysql-mmm-monitor-test-mmm

Open /etc/init.d/mysql-mmm-monitor-test-mmm and change the beginning of the file to:

# Cluster name (it can be empty for default cases)
CLUSTER='test_mmm'
LOCKFILE="/var/lock/subsys/mysql-mmm-monitor-${CLUSTER}"
prog='MMM Monitor Daemon'

if [ "$CLUSTER" != "" ]; then
        MMMD_MON_BIN="/usr/sbin/mmm_mond @$CLUSTER"
        MMMD_MON_PIDFILE="/var/run/mysql-mmm/mmm_mond_$CLUSTER.pid"
else 
        MMMD_MON_BIN="/usr/sbin/mmm_mond"
        MMMD_MON_PIDFILE="/var/run/mysql-mmm/mmm_mond.pid"
fi

start() {
...

If your init script differs from the one shown in this article, adapt it to your situation; the key points are that the monitor is started with /usr/sbin/mmm_mond @$CLUSTER and that the pid file matches the one in the configuration file.

PS: If there is only one cluster, the default init scripts can be used as-is.
Note: The pid file location in the configuration files must match the pid file location in the init scripts; if they differ, change one so they are consistent.

5 Starting MMM

The startup order for MMM is:

  1. Start the MMM monitor
  2. Start the MMM agents

To shut MMM down, perform the steps in reverse order, as shown in the sketch below.
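
Putting it together with the init scripts prepared in section 4.2, the sequence looks roughly like this:

# Start: monitor first, then every agent
/etc/init.d/mysql-mmm-monitor-test-mmm start     # on the monitor node
/etc/init.d/mysql-mmm-agent-test-mmm start       # on each MySQL node

# Stop: reverse order - agents first, then the monitor
/etc/init.d/mysql-mmm-agent-test-mmm stop        # on each MySQL node
/etc/init.d/mysql-mmm-monitor-test-mmm stop      # on the monitor node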

5.1 Starting the Monitor

Run the start command on the monitor node, for example:

[root@chengqm ~]# /etc/init.d/mysql-mmm-monitor-test-mmm start
Starting MMM Monitor Daemon:                               [  OK  ]

If startup reports errors, check the MMM logs under /var/log/mysql-mmm/.

5.2 Starting the Agents

Run the start command on every agent node, for example:

[root@cluster01 ~]# /etc/init.d/mysql-mmm-agent-test-mmm start
Daemon bin: '/usr/sbin/mmm_agentd @test_mmm'
Daemon pid: '/var/run/mmm_agentd-test_mmm.pid'
Starting MMM Agent daemon... Ok

5.3 Checking MMM Status

On the monitor node, run mmm_control @cluster show to see the status of each node:

[root@chengqm ~]# mmm_control @test_mmm show
  cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238), writer(10.0.0.237)
  cluster02(10.0.0.248) master/ONLINE. Roles: reader(10.0.0.239)
  cluster03(10.0.0.249) slave/ONLINE. Roles:

On the monitor node, run mmm_control @cluster checks all to run the checks against all nodes:

[root@chengqm ~]# mmm_control @test_mmm checks all
cluster01  ping         [last change: 2018/12/05 20:06:35]  OK
cluster01  mysql        [last change: 2018/12/05 20:23:59]  OK
cluster01  rep_threads  [last change: 2018/12/05 20:24:14]  OK
cluster01  rep_backlog  [last change: 2018/12/05 20:24:14]  OK: Backlog is null
cluster02  ping         [last change: 2018/12/05 20:06:35]  OK
cluster02  mysql        [last change: 2018/12/05 20:23:59]  OK
cluster02  rep_threads  [last change: 2018/12/05 20:24:14]  OK
cluster02  rep_backlog  [last change: 2018/12/05 20:24:14]  OK
cluster03  ping         [last change: 2018/12/05 20:06:35]  OK
cluster03  mysql        [last change: 2018/12/05 20:23:59]  OK
cluster03  rep_threads  [last change: 2018/12/05 20:24:14]  OK
cluster03  rep_backlog  [last change: 2018/12/05 20:24:14]  OK: Backlog is null

Check the VIPs on the Cluster1 host:

[root@cluster01 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.238/32 scope global eth0
    inet 10.0.0.237/32 scope global eth0
    inet6 fe80::f816:3eff:fede:8033/64 scope link 
       valid_lft forever preferred_lft forever

The VIPs match what MMM reports.

6 MMM Switchover

MMM supports two kinds of switchover: manual and automatic.

6.1 Moving a Role Directly

Relevant command: mmm_control [@cluster] move_role [writer/reader] host assigns the given role to a node.

Let's test it.

  • Current node status
[root@chengqm ~]# mmm_control @test_mmm show
  cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238), writer(10.0.0.237)
  cluster02(10.0.0.248) master/ONLINE. Roles: reader(10.0.0.239)
  cluster03(10.0.0.249) slave/ONLINE. Roles:
  • Cluster1 VIP
[mysql@cluster01 ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.238/32 scope global eth0
    inet 10.0.0.237/32 scope global eth0
    inet6 fe80::f816:3eff:fede:8033/64 scope link 
       valid_lft forever preferred_lft forever
  • Master1 read_only status
[mysql@cluster01 ~]$  /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show variables like 'read_only'";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | OFF   |
+---------------+-------+
  • Cluster2 VIP
[mysql@cluster02 ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:66:7e:e8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.248/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.239/32 scope global eth0
    inet6 fe80::f816:3eff:fe66:7ee8/64 scope link 
       valid_lft forever preferred_lft forever
  • Master2 read_only status
[mysql@cluster02 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show variables like 'read_only'";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | ON    |
+---------------+-------+
  • The slave's replication source
[mysql@cluster03 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show slave status \G";
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.247
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
...
....

Performing the switch

Run mmm_control @test_mmm move_role writer cluster02 to move the writer role:

[root@chengqm ~]# mmm_control @test_mmm move_role writer cluster02
OK: Role 'writer' has been moved from 'cluster01' to 'cluster02'. Now you can wait some time and check new roles info!
[root@chengqm ~]# mmm_control @test_mmm show
  cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238)
  cluster02(10.0.0.248) master/ONLINE. Roles: reader(10.0.0.239), writer(10.0.0.237)
  cluster03(10.0.0.249) slave/ONLINE. Roles:
  • Cluster2 VIP after the switch
[mysql@cluster02 ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:66:7e:e8 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.248/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.239/32 scope global eth0
    inet 10.0.0.237/32 scope global eth0
    inet6 fe80::f816:3eff:fe66:7ee8/64 scope link 
       valid_lft forever preferred_lft forever
  • Master2 read_only status after the switch
[mysql@cluster02 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show variables like 'read_only'";
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| read_only     | OFF   |
+---------------+-------+
  • The slave's replication source after the switch
[mysql@cluster03 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show slave status \G";
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.248
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60

The switchover completed successfully.

6.2 Switching with set_online / set_offline

The same switchover can also be achieved with the following two commands:

  • mmm_control [@cluster] set_offline host takes a node offline
  • mmm_control [@cluster] set_online host brings a node online

Now, suppose we want to move the writer role from Master2 back to Master1; we can do the following:

mmm_control @test_mmm set_offline cluster02
mmm_control @test_mmm set_online cluster02

The end result is the same as before, so it is not shown again.

6.3 Automatic Failover on Crash

Now let's demonstrate what happens automatically when the Master2 database goes down.
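
To trigger the failover we simply kill the mysqld process on Master2. A hedged sketch follows; how the instance is managed depends on your installation, and the pkill pattern below is only an example for this simulation.

# On cluster02 (10.0.0.248): simulate a crash of the test_mmm instance.
# Hypothetical example - adjust the pattern to how mysqld is actually run.
pgrep -fl mysqld                          # confirm the running MySQL processes
pkill -9 -f '/data/mysql_db/test_mmm'     # kill the instance using this datadir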

  • Kill MySQL on Master2 (see the sketch above)
  • Check the MMM monitor log, which shows the switchover:
[root@chengqm ~]# tail -8 /var/log/mysql-mmm/mmm_mond_test_mmm.log 
2018/12/06 18:09:27  WARN Check 'rep_backlog' on 'cluster02' is in unknown state! Message: UNKNOWN: Connect error (host = 10.0.0.248:3306, user = mmm_monitor)! Lost connection to MySQL server at 'reading initial communication packet', system error: 111
2018/12/06 18:09:30 ERROR Check 'mysql' on 'cluster02' has failed for 4 seconds! Message: ERROR: Connect error (host = 10.0.0.248:3306, user = mmm_monitor)! Lost connection to MySQL server at 'reading initial communication packet', system error: 111
2018/12/06 18:09:31 FATAL State of host 'cluster02' changed from ONLINE to HARD_OFFLINE (ping: OK, mysql: not OK)
2018/12/06 18:09:31  INFO Removing all roles from host 'cluster02':
2018/12/06 18:09:31  INFO     Removed role 'reader(10.0.0.238)' from host 'cluster02'
2018/12/06 18:09:31  INFO     Removed role 'writer(10.0.0.237)' from host 'cluster02'
2018/12/06 18:09:31  INFO Orphaned role 'writer(10.0.0.237)' has been assigned to 'cluster01'
2018/12/06 18:09:31  INFO Orphaned role 'reader(10.0.0.238)' has been assigned to 'cluster01'
  • Check the node status
[root@chengqm ~]# mmm_control @test_mmm show
  cluster01(10.0.0.247) master/ONLINE. Roles: reader(10.0.0.238), reader(10.0.0.239), writer(10.0.0.237)
  cluster02(10.0.0.248) master/HARD_OFFLINE. Roles: 
  cluster03(10.0.0.249) slave/ONLINE. Roles:
  • Cluster1 VIP after the failover
[mysql@cluster01 ~]$ ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether fa:16:3e:de:80:33 brd ff:ff:ff:ff:ff:ff
    inet 10.0.0.247/16 brd 10.0.255.255 scope global eth0
    inet 10.0.0.238/32 scope global eth0
    inet 10.0.0.237/32 scope global eth0
    inet6 fe80::f816:3eff:fede:8033/64 scope link 
       valid_lft forever preferred_lft forever
  • The slave's replication source after the failover
[mysql@cluster03 ~]$ /usr/local/mysql57/bin/mysql -S /data/mysql_db/test_mmm/mysql.sock -e "show slave status \G";
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.247
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60

As you can see, when the database crashed, MMM switched over automatically, which is how high availability is achieved.

7. Summary

7.1 Advantages of MMM

  1. MMM manages the master and standby master and provides high availability for all nodes
  2. When a node fails it is switched out automatically, and it is brought back online automatically once it recovers

7.2 Disadvantages of MMM

  1. Data can easily be lost during a master switchover.
  2. The MMM monitor is a single point of failure; in other words, MMM itself is not highly available, so the monitor should be deployed separately from the databases to avoid losing both at once

In practice, the author has also found that:

  1. Master/standby switchover occasionally breaks replication on the slaves (duplicate key errors, missing rows)
  2. After a crash-triggered failover and recovery, some data on the failed node is lost

7.3 When MMM Is a Good Fit

  1. Workloads with loose consistency requirements that can tolerate losing a small amount of data, e.g. comments or news-feed style data
  2. Read-heavy workloads that need read load balancing across all nodes (a later article will cover how to do the load balancing)

With that, the MMM high-availability setup is complete.

8. Appendix

8.1 Problems and Solutions

1). Configuration file permissions

  • Symptom
FATAL Configuration file /etc/mysql-mmm/mmm_agent*.conf is world writable!
FATAL Configuration file /etc/mysql-mmm/mmm_agent*.conf is world readable!
  • Solution
chmod 664 /etc/mysql-mmm/*

2). Listener conflicts

  • Symptom

This tends to happen when several MMM monitor instances run on the same host; the error looks like this:

FATAL Listener: Can’t create socket!
  • Solution
  1. Check whether the ports in the configuration files conflict
  2. Check whether the port is already in use on the machine
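
A quick way to check both points might look like this (9989 and 9992 are the agent and monitor ports used in this article's configuration):

# Is anything already listening on the agent/monitor ports?
netstat -lntp | grep -E ':(9989|9992)'
# Do any of the MMM config files on this host reuse the same port?
grep -nE 'agent_port|^[[:space:]]*port' /etc/mysql-mmm/*.conf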

3). Wrong network interface

  • Symptom
FATAL Couldn’t configure IP ‘192.168.1.202’ on interface ‘em1’: undef
  • Solution

Check the network interfaces with ifconfig and fix the configuration file accordingly.
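
For example, to find the interface that actually holds the node's IP and make cluster_interface match it (the IP below is cluster01's; substitute your own):

# Which interface carries this node's IP?
ip addr | grep -B2 '10.0.0.247'
# Make sure cluster_interface in mmm_common_test_mmm.conf names that interface
grep cluster_interface /etc/mysql-mmm/mmm_common_test_mmm.conf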

8.2 The Six MMM States and Their Causes

States

  1. online
  2. admin_offline
  3. hard_offline
  4. awaiting_recovery
  5. replication_delay
  6. replication_fail

What each state means:

  1. ONLINE: Host is running without any problems.
  2. ADMIN_OFFLINE: host was set to offline manually.
  3. HARD_OFFLINE: Host is offline (Check ping and/or mysql failed)
  4. AWAITING_RECOVERY: Host is awaiting recovery
  5. REPLICATION_DELAY: replication backlog is too big (Check rep_backlog failed)
  6. REPLICATION_FAIL: replication threads are not running (Check rep_threads failed)

Additional notes

  1. Only hosts with state ONLINE may have roles. When a host switches from ONLINE to any other state, all roles will be removed from it.
  2. A host that was in state REPLICATION_DELAY or REPLICATION_FAIL will be switched back to ONLINE if everything is OK again, unless it is flapping (see Flapping).
  3. A host that was in state HARD_OFFLINE will be switched to AWAITING_RECOVERY if everything is OK again. If its downtime was shorter than 60 seconds and it wasn't rebooted or auto_set_online is > 0 it will be switched back to ONLINE automatically, unless it is flapping (see Flapping again).
  4. Replication backlog or failure on the active master isn't considered to be a problem, so the active master will never be in state REPLICATION_DELAY or REPLICATION_FAIL.
  5. Replication backlog or failure will be ignored on hosts whose peers got ONLINE less than 60 seconds ago (That's the default value of master-connect-retry).
  6. If both checks rep_backlog and rep_threads fail, the state will change to REPLICATION_FAIL.
  7. If auto_set_online is > 0, flapping hosts will automatically be set to ONLINE after flap_duration seconds.
Reference: the official MMM documentation