OpenStack-Mitaka HA: Overview
OpenStack-Mitaka HA: Environment Initialization
OpenStack-Mitaka HA: MariaDB Galera Cluster Deployment
OpenStack-Mitaka HA: RabbitMQ Server Cluster Deployment
OpenStack-Mitaka HA: Memcached
OpenStack-Mitaka HA: Pacemaker + Corosync + pcs High-Availability Cluster
OpenStack-Mitaka HA: Identity Service (Keystone)
OpenStack-Mitaka HA: Image Service (Glance)
OpenStack-Mitaka HA: Compute Service (Nova)
OpenStack-Mitaka HA: Networking Service (Neutron)
OpenStack-Mitaka HA: Dashboard
OpenStack-Mitaka HA: Launching an Instance
OpenStack-Mitaka HA: Testing
Pacemaker: works at the resource allocation layer and provides the resource manager functionality.
Corosync: provides the cluster messaging layer, carrying heartbeat and cluster membership/transaction information.
Pacemaker + Corosync together provide the high-availability cluster architecture.
Run the following on all three nodes:
# yum install pcs -y
# systemctl start pcsd ; systemctl enable pcsd
# echo 'hacluster' | passwd --stdin hacluster
# yum install haproxy rsyslog -y
# echo 'net.ipv4.ip_nonlocal_bind = 1' >> /etc/sysctl.conf      # allow services to bind to the VIP even when it is not configured locally
# echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf            # enable kernel IP forwarding
# sysctl -p
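A quick optional check (not part of the original steps) to confirm the two kernel parameters took effect after sysctl -p:

# sysctl net.ipv4.ip_nonlocal_bind net.ipv4.ip_forward
net.ipv4.ip_nonlocal_bind = 1
net.ipv4.ip_forward = 1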
On any node, create the user that HAProxy will use to monitor MariaDB:
MariaDB [(none)]> CREATE USER 'haproxy'@'%' ;
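Optionally, verify that the monitoring user exists; HAProxy's mysql-check only opens a connection as this user, so it needs neither a password nor any privileges:

MariaDB [(none)]> SELECT User, Host FROM mysql.user WHERE User = 'haproxy';
+---------+------+
| User    | Host |
+---------+------+
| haproxy | %    |
+---------+------+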
Configure HAProxy as the load balancer:
[root@controller1 ~]# egrep -v "^#|^$" /etc/haproxy/haproxy.cfg
global
    log         127.0.0.1 local2
    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon
    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout http-request    10s
    timeout queue           1m
    timeout connect         10s
    timeout client          1m
    timeout server          1m
    timeout http-keep-alive 10s
    timeout check           10s
    maxconn                 4000
listen galera_cluster
    mode tcp
    bind 192.168.0.10:3306
    balance source
    option mysql-check user haproxy
    server controller1 192.168.0.11:3306 check inter 2000 rise 3 fall 3 backup
    server controller2 192.168.0.12:3306 check inter 2000 rise 3 fall 3
    server controller3 192.168.0.13:3306 check inter 2000 rise 3 fall 3 backup
listen memcache_cluster
    mode tcp
    bind 192.168.0.10:11211
    balance source
    option tcplog
    server controller1 192.168.0.11:11211 check inter 2000 rise 3 fall 3
    server controller2 192.168.0.12:11211 check inter 2000 rise 3 fall 3
    server controller3 192.168.0.13:11211 check inter 2000 rise 3 fall 3
Notes:
(1) Make sure the HAProxy configuration is correct; it is recommended to first adjust the IPs and ports, then start HAProxy to confirm it works.
(2) MariaDB Galera and RabbitMQ listen on 0.0.0.0 by default; change them to listen on each node's local address 192.168.0.x (see the sketch after this list).
(3) Copy the verified HAProxy configuration to the other nodes; there is no need to start the haproxy service manually.
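A minimal sketch of the adjustments from notes (1) to (3). The exact configuration file locations depend on how MariaDB Galera and RabbitMQ were installed in the earlier parts of this series, so treat /etc/my.cnf.d/server.cnf and the classic /etc/rabbitmq/rabbitmq.config format as assumptions:

# Check the HAProxy configuration for syntax errors before starting it
# haproxy -c -f /etc/haproxy/haproxy.cfg

# MariaDB Galera: listen on the node's own address instead of 0.0.0.0
# (assumed location: /etc/my.cnf.d/server.cnf, [mysqld] section, per-node address)
bind-address = 192.168.0.11

# RabbitMQ: restrict the AMQP listener to the node's own address
# (assumed classic /etc/rabbitmq/rabbitmq.config format)
[{rabbit, [{tcp_listeners, [{"192.168.0.11", 5672}]}]}].

# Copy the verified haproxy.cfg to the other controller nodes
# scp /etc/haproxy/haproxy.cfg controller2:/etc/haproxy/haproxy.cfg
# scp /etc/haproxy/haproxy.cfg controller3:/etc/haproxy/haproxy.cfg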
Configure logging for HAProxy (run on all controller nodes):
# vim /etc/rsyslog.conf
…
$ModLoad imudp
$UDPServerRun 514
…
local2.*                /var/log/haproxy/haproxy.log
…
# mkdir -pv /var/log/haproxy/
mkdir: created directory ‘/var/log/haproxy/’
# systemctl restart rsyslog
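A quick optional check that rsyslog is routing the local2 facility to the new log file:

# logger -p local2.info "haproxy rsyslog test"
# tail -n 1 /var/log/haproxy/haproxy.log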
Start HAProxy and verify:
# systemctl start haproxy
[root@controller1 ~]# netstat -ntplu | grep ha
tcp        0      0 192.168.0.10:3306       0.0.0.0:*               LISTEN      15467/haproxy
tcp        0      0 192.168.0.10:11211      0.0.0.0:*               LISTEN      15467/haproxy
udp        0      0 0.0.0.0:43268           0.0.0.0:*                           15466/haproxy

Verification succeeded; stop HAProxy again:
# systemctl stop haproxy
Run on the controller1 node:
[root@controller1 ~]# pcs cluster auth controller1 controller2 controller3 -u hacluster -p hacluster --force
controller3: Authorized
controller2: Authorized
controller1: Authorized
Create the cluster:
[root@controller1 ~]# pcs cluster setup --name openstack-cluster controller1 controller2 controller3 --force
Destroying cluster on nodes: controller1, controller2, controller3...
controller3: Stopping Cluster (pacemaker)...
controller2: Stopping Cluster (pacemaker)...
controller1: Stopping Cluster (pacemaker)...
controller3: Successfully destroyed cluster
controller1: Successfully destroyed cluster
controller2: Successfully destroyed cluster

Sending 'pacemaker_remote authkey' to 'controller1', 'controller2', 'controller3'
controller3: successful distribution of the file 'pacemaker_remote authkey'
controller1: successful distribution of the file 'pacemaker_remote authkey'
controller2: successful distribution of the file 'pacemaker_remote authkey'
Sending cluster config files to the nodes...
controller1: Succeeded
controller2: Succeeded
controller3: Succeeded
Synchronizing pcsd certificates on nodes controller1, controller2, controller3...
controller3: Success
controller2: Success
controller1: Success
Restarting pcsd on the nodes in order to reload the certificates...
controller3: Success
controller2: Success
controller1: Success
Start all nodes of the cluster:
[root@controller1 ~]# pcs cluster start --all
controller2: Starting Cluster...
controller1: Starting Cluster...
controller3: Starting Cluster...
[root@controller1 ~]# pcs cluster enable --all
controller1: Cluster Enabled
controller2: Cluster Enabled
controller3: Cluster Enabled
Check the cluster status:
[root@controller1 ~]# pcs status
Cluster name: openstack-cluster
WARNING: no stonith devices and stonith-enabled is not false
Stack: corosync
Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Thu Nov 30 19:30:43 2017
Last change: Thu Nov 30 19:30:17 2017 by hacluster via crmd on controller3

3 nodes configured
0 resources configured

Online: [ controller1 controller2 controller3 ]

No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

[root@controller1 ~]# pcs cluster status
Cluster Status:
 Stack: corosync
 Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
 Last updated: Thu Nov 30 19:30:52 2017
 Last change: Thu Nov 30 19:30:17 2017 by hacluster via crmd on controller3
 3 nodes configured
 0 resources configured

PCSD Status:
  controller2: Online
  controller3: Online
  controller1: Online
All three nodes are online.
The default quorum (voting) rule recommends an odd number of cluster nodes, and no fewer than 3. With only 2 nodes, if one of them fails the default quorum rule is no longer satisfied, cluster resources will not fail over, and the cluster as a whole becomes unusable. Setting no-quorum-policy="ignore" works around this two-node problem, but it should not be used in production. In other words, a production environment still needs at least 3 nodes.
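To see the current vote count and the quorum requirement directly, corosync-quorumtool can be used (optional check; output trimmed, roughly as it appears on corosync 2.x):

# corosync-quorumtool -s
...
Votequorum information
----------------------
Expected votes:   3
Total votes:      3
Quorum:           2
Flags:            Quorate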
pe-warn-series-max, pe-input-series-max and pe-error-series-max control the log depth (how many Policy Engine warning/input/error files are kept).
cluster-recheck-interval is the interval at which the cluster state is re-checked.
[root@controller1 ~]# pcs property set pe-warn-series-max=1000 pe-input-series-max=1000 pe-error-series-max=1000 cluster-recheck-interval=5min
Disable STONITH:
STONITH is a physical fencing device that can power a node off on command. This environment has no such device, and if the option is not disabled, every pcs command keeps printing errors about it.
[root@controller1 ~]# pcs property set stonith-enabled=false
With only two nodes, ignore the quorum policy:
[root@controller1 ~]# pcs property set no-quorum-policy=ignore
Validate the cluster configuration:
[root@controller1 ~]# crm_verify -L -V
Configure a virtual IP for the cluster:
[root@controller1 ~]# pcs resource create ClusterIP ocf:heartbeat:IPaddr2 \
    ip="192.168.0.10" cidr_netmask=32 nic=eno16777736 op monitor interval=30s
At this point Pacemaker + Corosync exist to serve HAProxy, so add the haproxy resource to the Pacemaker cluster:
[root@controller1 ~]# pcs resource create lb-haproxy systemd:haproxy --clone
Note: this creates a clone resource; a cloned resource is started on all nodes, so haproxy will be started automatically on all three nodes here.
Check the Pacemaker resources:
[root@controller1 ~]# pcs resource
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller1     # the VIP resource, started on a single node
 Clone Set: lb-haproxy-clone [lb-haproxy]                               # the haproxy clone resource
     Started: [ controller1 controller2 controller3 ]
Note: the resources must be bound (colocated) here; otherwise haproxy keeps running on every node and access becomes inconsistent.
Bind the two resources to the same node:
[root@controller1 ~]# pcs constraint colocation add lb-haproxy-clone ClusterIP INFINITY
Binding succeeded:
[root@controller1 ~]# pcs resource
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller3
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]
Configure the start order of the resources: the VIP starts first and haproxy starts after it, because haproxy listens on the VIP:
[root@controller1 ~]# pcs constraint order ClusterIP then lb-haproxy-clone
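To double-check the ordering and colocation rules that were just added, pcs can list all constraints; the output should look roughly like this:

[root@controller1 ~]# pcs constraint
Location Constraints:
Ordering Constraints:
  start ClusterIP then start lb-haproxy-clone (kind:Mandatory)
Colocation Constraints:
  lb-haproxy-clone with ClusterIP (score:INFINITY)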
Manually pin the resources to a preferred default node; because the two resources are colocated, moving one makes the other follow automatically.
[root@controller1 ~]# pcs constraint location ClusterIP prefers controller1
[root@controller1 ~]# pcs resource
 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller1
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller1 ]
     Stopped: [ controller2 controller3 ]
[root@controller1 ~]# pcs resource defaults resource-stickiness=100    # set resource stickiness so resources do not fail back automatically and destabilize the cluster

The VIP is now bound to the controller1 node:
[root@controller1 ~]# ip a | grep global
    inet 192.168.0.11/24 brd 192.168.0.255 scope global eno16777736
    inet 192.168.0.10/32 brd 192.168.0.255 scope global eno16777736
    inet 192.168.118.11/24 brd 192.168.118.255 scope global eno33554992
Try connecting to the database through the VIP.
Controller1:
[root@controller1 haproxy]# mysql -ugalera -pgalera -h 192.168.0.10
Controller2:
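The same connection test from controller2, using the same galera credentials as on controller1 (a minimal sketch):

[root@controller2 ~]# mysql -ugalera -pgalera -h 192.168.0.10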
The high-availability configuration is working.
Test whether high availability behaves correctly.
Run poweroff -f directly on the controller1 node:
[root@controller1 ~]# poweroff -f
The VIP fails over to the controller2 node almost immediately.
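This can be confirmed on controller2, assuming the same interface name as on controller1:

[root@controller2 ~]# ip a | grep 192.168.0.10
    inet 192.168.0.10/32 brd 192.168.0.255 scope global eno16777736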
Try accessing the database again.
It works without any issue; the test is successful.
Check the cluster status:
[root@controller2 ~]# pcs status
Cluster name: openstack-cluster
Stack: corosync
Current DC: controller3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Thu Nov 30 23:57:28 2017
Last change: Thu Nov 30 23:54:11 2017 by root via crm_attribute on controller1

3 nodes configured
4 resources configured

Online: [ controller2 controller3 ]
OFFLINE: [ controller1 ]            # controller1 is now offline

Full list of resources:

 ClusterIP      (ocf::heartbeat:IPaddr2):       Started controller2
 Clone Set: lb-haproxy-clone [lb-haproxy]
     Started: [ controller2 ]
     Stopped: [ controller1 controller3 ]

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled