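GlusterFS Distributed Storage High-Availability Design
(1) Build a GlusterFS replicated volume (using at least 2 storage nodes)
(2) Configure Keepalived to manage the GlusterFS master and backup storage nodes
(3) Configure a Keepalived floating IP (VIP) to provide the storage service externally
(4) Achieve storage high availability (that is, use two GlusterFS servers to provide a two-node replicated volume with fast failover, so that storage stays continuously available)
(5) Suitable for business scenarios that store critical data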
1. Environment preparation
IP | Hostname | Storage | OS | VIP
172.16.10.10 | data-node-01 | /dev/sdb1 | CentOS7 | 172.16.10.220 (note: used to provide the storage service externally)
172.16.10.11 | data-node-02 | /dev/sdb1 | CentOS7 |
172.16.10.12 | web-node-12 | client | |
/etc/hosts configuration
    172.16.10.10 data-node-01
    172.16.10.11 data-node-02
Create the GlusterFS storage mount point
    mkdir -p /glusterfs/storage1
    echo "/dev/sdb1 /glusterfs/storage1 xfs defaults 0 0" >> /etc/fstab
    mount -a
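Running the echo line above twice leaves a duplicate entry in /etc/fstab. A minimal sketch of an idempotent variant follows; add_fstab_entry is a hypothetical helper name, and its optional third argument (an alternative fstab path) is an assumption added only so the function can be tried against a scratch file.

```shell
#!/bin/bash
# Hypothetical helper: append an xfs fstab entry only if the mount point
# is not already listed. The third argument defaults to /etc/fstab.
add_fstab_entry() {
    local device="$1" mountpoint="$2" fstab="${3:-/etc/fstab}"
    # Look for the mount point in the second field of any non-comment line.
    if awk -v mp="$mountpoint" \
        '$1 !~ /^#/ && $2 == mp { found = 1 } END { exit !found }' "$fstab"; then
        echo "entry for $mountpoint already present in $fstab"
        return 0
    fi
    echo "$device $mountpoint xfs defaults 0 0" >> "$fstab"
}

# Example (on a storage node, as root):
#   add_fstab_entry /dev/sdb1 /glusterfs/storage1 && mount -a
```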
2. Install the GlusterFS server software
(1) Install the gluster repository, then install glusterfs and the related packages
    yum install centos-release-gluster -y
    yum install glusterfs glusterfs-server glusterfs-cli glusterfs-geo-replication glusterfs-rdma -y
(2) Install the GlusterFS client software on the client
    yum install glusterfs-fuse
(3) Start the glusterd service
    systemctl start glusterd
(4) Add the trusted peers from any one node
    gluster peer probe data-node-02
    gluster peer probe data-node-01
    gluster peer status
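Before creating the volume it can help to confirm that the peer really is connected. The sketch below counts connected peers from the text that gluster peer status prints; count_connected_peers is a hypothetical helper, and it assumes the usual "State: Peer in Cluster (Connected)" wording of that output.

```shell
#!/bin/bash
# Hypothetical helper: read `gluster peer status` output on stdin and
# print how many peers are reported as connected cluster members.
count_connected_peers() {
    grep -cF 'State: Peer in Cluster (Connected)'
}

# Example (on either storage node):
#   gluster peer status | count_connected_peers
```

In this two-node setup each node should report one connected peer.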
(5) Create the replicated volume from any one node
    mkdir /glusterfs/storage1/rep_vol1
    gluster volume create rep_vol1 replica 2 data-node-01:/glusterfs/storage1/rep_vol1 data-node-02:/glusterfs/storage1/rep_vol1
(6) Start the replicated volume
    gluster volume start rep_vol1
(7) Check the replicated volume status
    gluster volume status
    gluster volume info
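The status check can also be scripted. A minimal sketch, assuming gluster volume info prints "Volume Name:" and "Status:" lines for each volume; volume_is_started is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read `gluster volume info` output on stdin and
# succeed only if the named volume reports "Status: Started".
volume_is_started() {
    awk -v vol="$1" '
        $1 == "Volume" && $2 == "Name:" { current = $3 }
        $1 == "Status:" && current == vol && $2 == "Started" { ok = 1 }
        END { exit !ok }
    '
}

# Example:
#   gluster volume info | volume_is_started rep_vol1 && echo "rep_vol1 is started"
```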
(8) Test mounting the replicated volume on the client
    mount -t glusterfs data-node-01:rep_vol1 /data/
(9) Test storing data on the replicated volume from the client
    for i in $(seq -w 1 3); do cp -rp /var/log/messages "/data/test-$i"; done
    [root@localhost ~]# ls /data/
    111  1.txt  2.txt  anaconda-ks.cfg  test-1  test-2  test-3
3. Install and configure Keepalived
(1) Install Keepalived
    yum -y install keepalived
(2) Start the keepalived service
    systemctl start keepalived
(3) keepalived configuration on the master node
    ! Configuration File for keepalived
    global_defs {
        notification_email {
            mail@huangming.org
        }
        notification_email_from Alexandre.Cassen@firewall.loc
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id GFS_HA_MASTER
        vrrp_skip_check_adv_addr
    }
    vrrp_sync_group GFS_HA_GROUP {
        group {
            GFS_HA_1
        }
    }
    vrrp_script monitor_glusterfs_status {
        script "/etc/keepalived/scripts/monitor_glusterfs_status.sh"
        interval 5
        fall 3
        rise 1
        weight 20
    }
    vrrp_instance GFS_HA_1 {
        state BACKUP
        interface ens34
        virtual_router_id 107
        priority 100
        advert_int 2
        nopreempt
        authentication {
            auth_type PASS
            auth_pass 11112222
        }
        virtual_ipaddress {
            172.16.10.220/24 dev ens34
        }
        track_script {
            monitor_glusterfs_status
        }
        track_interface {
            ens34
        }
        notify_master "/etc/keepalived/scripts/keepalived_notify.sh master"
        notify_backup "/etc/keepalived/scripts/keepalived_notify.sh backup"
        notify_fault "/etc/keepalived/scripts/keepalived_notify.sh fault"
        notify_stop "/etc/keepalived/scripts/keepalived_notify.sh stop"
    }
(4) keepalived configuration on the backup node
    ! Configuration File for keepalived
    global_defs {
        notification_email {
            mail@huangming.org
        }
        notification_email_from Alexandre.Cassen@firewall.loc
        smtp_server 127.0.0.1
        smtp_connect_timeout 30
        router_id GFS_HA_MASTER
        vrrp_skip_check_adv_addr
    }
    vrrp_sync_group GFS_HA_GROUP {
        group {
            GFS_HA_1
        }
    }
    vrrp_script monitor_glusterfs_status {
        script "/etc/keepalived/scripts/monitor_glusterfs_status.sh"
        interval 5
        fall 3
        rise 1
        weight 20
    }
    vrrp_instance GFS_HA_1 {
        state BACKUP
        interface ens34
        virtual_router_id 107
        priority 90
        advert_int 2
        authentication {
            auth_type PASS
            auth_pass 11112222
        }
        virtual_ipaddress {
            172.16.10.220/24 dev ens34
        }
        track_script {
            monitor_glusterfs_status
        }
        track_interface {
            ens34
        }
        notify_master "/etc/keepalived/scripts/keepalived_notify.sh master"
        notify_backup "/etc/keepalived/scripts/keepalived_notify.sh backup"
        notify_fault "/etc/keepalived/scripts/keepalived_notify.sh fault"
        notify_stop "/etc/keepalived/scripts/keepalived_notify.sh stop"
    }
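The two configurations differ only in priority (100 vs 90) and in nopreempt, while both track the same script with weight 20. A small sketch of how the effective priority follows from those values, assuming keepalived's usual rule that a positive weight is added while the tracked script succeeds and a negative weight is added while it fails; effective_priority is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: compute a VRRP instance's effective priority from
# its base priority, the track_script weight, and the script result
# (1 = script succeeding, 0 = script failing).
effective_priority() {
    local base="$1" weight="$2" script_ok="$3"
    if [ "$weight" -gt 0 ] && [ "$script_ok" -eq 1 ]; then
        echo $((base + weight))      # positive weight: bonus while healthy
    elif [ "$weight" -lt 0 ] && [ "$script_ok" -eq 0 ]; then
        echo $((base + weight))      # negative weight: penalty while failing
    else
        echo "$base"
    fi
}

effective_priority 100 20 1   # master node, script healthy
effective_priority 90 20 1    # backup node, script healthy
effective_priority 100 20 0   # master node, script failing
```

With both scripts healthy the nodes advertise 120 and 110; if the check fails on the master it drops back to 100, below the backup's 110.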
(5) keepalived VRRP monitoring script
    #!/bin/bash
    # Check the glusterd and glusterfsd services
    if systemctl status glusterd &>/dev/null; then
        # glusterd is running: report healthy only if glusterfsd is running too
        if systemctl status glusterfsd &>/dev/null; then
            exit 0
        else
            exit 2
        fi
    else
        # glusterd is down: try to start it, then stop keepalived so the VIP moves
        systemctl start glusterd &>/dev/null
        systemctl stop keepalived &>/dev/null && exit 1
    fi
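The script above both checks and acts, which makes it awkward to try out on a machine without GlusterFS. The sketch below isolates only the decision logic; gluster_health and the STATUS_CMD hook are hypothetical names introduced here, and unlike the original script it restarts nothing and only reports.

```shell
#!/bin/bash
# STATUS_CMD is a hypothetical hook: the command used to ask whether a
# unit is running. It defaults to systemctl but can be replaced for dry runs.
STATUS_CMD="${STATUS_CMD:-systemctl is-active --quiet}"

# Print a short health verdict and return the same exit codes the
# monitoring script uses: 0 healthy, 1 glusterd down, 2 glusterfsd down.
gluster_health() {
    if ! $STATUS_CMD glusterd; then
        echo "glusterd-down"
        return 1
    fi
    if ! $STATUS_CMD glusterfsd; then
        echo "glusterfsd-down"
        return 2
    fi
    echo "healthy"
}

# Example (on a storage node):
#   gluster_health; echo "exit code: $?"
```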
(6) keepalived notify script (manages the glusterd service)
    #!/bin/bash
    # keepalived notify script for glusterd
    master() {
        # On becoming master: start glusterd if it is down, otherwise restart it
        if ! systemctl status glusterd; then
            systemctl start glusterd
        else
            systemctl restart glusterd
        fi
    }
    backup() {
        # On becoming backup: only make sure glusterd is running
        if ! systemctl status glusterd; then
            systemctl start glusterd
        fi
    }
    case $1 in
        master)
            master
            ;;
        backup)
            backup
            ;;
        fault)
            backup
            ;;
        stop)
            backup
            systemctl restart keepalived
            ;;
        *)
            echo "Usage: $0 {master|backup|fault|stop}"
    esac
4. Test that Keepalived automatically takes over the GlusterFS service, and test storage availability
(1) Restart the keepalived service
    systemctl restart keepalived.service
(2) Check which node holds the VIP
    ## On node 1
    [root@data-node-01 ~]# ip a show dev ens34
    3: ens34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:b2:b5:2a brd ff:ff:ff:ff:ff:ff
        inet 172.16.10.10/24 brd 172.16.10.255 scope global ens34
           valid_lft forever preferred_lft forever
        inet 172.16.10.220/24 scope global secondary ens34
           valid_lft forever preferred_lft forever
        inet6 fe80::ce9a:ee2e:7b6c:a6bb/64 scope link
           valid_lft forever preferred_lft forever
    ## On node 2
    [root@data-node-02 ~]# ip a show dev ens34
    3: ens34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:ba:42:cf brd ff:ff:ff:ff:ff:ff
        inet 172.16.10.11/24 brd 172.16.10.255 scope global ens34
           valid_lft forever preferred_lft forever
        inet6 fe80::e23:ce0:65c3:ffbf/64 scope link
           valid_lft forever preferred_lft forever
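Instead of reading the ip output by eye, the VIP check can be scripted. A minimal sketch that scans ip a output for the floating address; holds_vip is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read `ip a show dev <iface>` output on stdin and
# succeed if the given IPv4 address is configured on that interface.
holds_vip() {
    local vip="$1"
    # Fixed-string match on "inet <vip>/" so that neither inet6 lines nor
    # longer addresses sharing the same prefix are matched.
    grep -Fq "inet ${vip}/"
}

# Example (on each storage node):
#   ip a show dev ens34 | holds_vip 172.16.10.220 && echo "this node holds the VIP"
```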
(3) On the client, mount the replicated volume provided by GlusterFS through the VIP and test whether it is usable
    mount -t glusterfs 172.16.10.220:rep_vol1 /data/
    [root@localhost ~]# ls /data/
    111  1.txt  2.txt  anaconda-ks.cfg  test  test-1  test-2  test-3
    [root@localhost ~]# mkdir /data/test
    [root@localhost ~]# echo 1111 >/data/test/1.txt
    [root@localhost ~]# ls /data/test
    1.txt
    [root@localhost ~]# cat /data/test/1.txt
    1111
Check the contents of the replicated volume on a GlusterFS node
    [root@data-node-02 ~]# ls /glusterfs/storage1/rep_vol1/
    111  1.txt  2.txt  anaconda-ks.cfg  test  test-1  test-2  test-3
(4) Test GlusterFS service failover
Shut down or reboot the master node (node 1), then check whether the GlusterFS service and the VIP move to node 2
    [root@data-node-01 ~]# reboot
    [root@data-node-02 ~]# ip a show dev ens34
    3: ens34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:ba:42:cf brd ff:ff:ff:ff:ff:ff
        inet 172.16.10.11/24 brd 172.16.10.255 scope global ens34
           valid_lft forever preferred_lft forever
        inet 172.16.10.220/24 scope global secondary ens34
           valid_lft forever preferred_lft forever
        inet6 fe80::e23:ce0:65c3:ffbf/64 scope link
           valid_lft forever preferred_lft forever
    [root@data-node-02 ~]# tail -f /var/log/messages
    Aug 27 22:56:19 data-node-02 Keepalived_vrrp[2563]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
    Aug 27 22:56:19 data-node-02 Keepalived_vrrp[2563]: Sync group GFS_HA_GROUP has only 1 virtual router(s) - removing
    Aug 27 22:56:19 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) removing protocol VIPs.
    Aug 27 22:56:19 data-node-02 Keepalived_vrrp[2563]: Using LinkWatch kernel netlink reflector...
    Aug 27 22:56:19 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) Entering BACKUP STATE
    Aug 27 22:56:19 data-node-02 Keepalived_vrrp[2563]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
    Aug 27 22:56:19 data-node-02 Keepalived_vrrp[2563]: VRRP_Script(monitor_glusterfs_status) succeeded
    Aug 27 22:56:19 data-node-02 kernel: nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
    Aug 27 22:56:19 data-node-02 kernel: IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
    Aug 27 22:56:19 data-node-02 kernel: IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
    Aug 27 22:56:19 data-node-02 kernel: IPVS: Creating netns size=2040 id=0
    Aug 27 22:56:19 data-node-02 kernel: IPVS: ipvs loaded.
    Aug 27 22:56:19 data-node-02 Keepalived_healthcheckers[2562]: Opening file '/etc/keepalived/keepalived.conf'.
    Aug 27 22:56:21 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) Changing effective priority from 90 to 110
    Aug 27 23:01:01 data-node-02 systemd: Started Session 3 of user root.
    Aug 27 23:01:01 data-node-02 systemd: Starting Session 3 of user root.
    Aug 27 23:03:09 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) Transition to MASTER STATE
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) Entering MASTER STATE
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) setting protocol VIPs.
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) Sending/queueing gratuitous ARPs on ens34 for 172.16.10.220
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:11 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:11 data-node-02 systemd: Stopping GlusterFS, a clustered file-system server...
    Aug 27 23:03:11 data-node-02 systemd: Starting GlusterFS, a clustered file-system server...
    Aug 27 23:03:12 data-node-02 systemd: Started GlusterFS, a clustered file-system server.
    Aug 27 23:03:16 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:16 data-node-02 Keepalived_vrrp[2563]: VRRP_Instance(GFS_HA_1) Sending/queueing gratuitous ARPs on ens34 for 172.16.10.220
    Aug 27 23:03:16 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:16 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:16 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
    Aug 27 23:03:16 data-node-02 Keepalived_vrrp[2563]: Sending gratuitous ARP on ens34 for 172.16.10.220
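Most of that log is noise when the question is simply when a node changed role. A minimal sketch that keeps only the VRRP state changes, assuming the syslog line layout shown in this article; vrrp_transitions is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read syslog lines on stdin and print
# "<time> <host> <state>" for every VRRP state the instance enters.
vrrp_transitions() {
    awk '/Keepalived_vrrp/ && /Entering (MASTER|BACKUP|FAULT) STATE/ {
        # $3 = time, $4 = hostname, next-to-last field = state name
        print $3, $4, $(NF-1)
    }'
}

# Example:
#   vrrp_transitions < /var/log/messages
```

Applied to the node 2 log in this article, it would report BACKUP at 22:56:19 and MASTER at 23:03:11.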
Test on the client whether the storage is still usable
    [root@localhost ~]# df -Th
    Filesystem              Type            Size  Used Avail Use% Mounted on
    /dev/mapper/cl-root     xfs              40G  1.2G   39G   3% /
    devtmpfs                devtmpfs        1.9G     0  1.9G   0% /dev
    tmpfs                   tmpfs           1.9G     0  1.9G   0% /dev/shm
    tmpfs                   tmpfs           1.9G  8.6M  1.9G   1% /run
    tmpfs                   tmpfs           1.9G     0  1.9G   0% /sys/fs/cgroup
    /dev/sda1               xfs            1014M  139M  876M  14% /boot
    tmpfs                   tmpfs           378M     0  378M   0% /run/user/0
    172.16.10.220:rep_vol1  fuse.glusterfs   10G  136M  9.9G   2% /data
    [root@localhost ~]# ls /data/
    111  1.txt  2.txt  anaconda-ks.cfg  test  test-1  test-2  test-3
    [root@localhost ~]# touch /data/test.log
    [root@localhost ~]# ls -l /data/
    total 964
    drwxr-xr-x 3 root root   4096 Aug 27 21:58 111
    -rw-r--r-- 1 root root     10 Aug 27 21:23 1.txt
    -rw-r--r-- 1 root root      6 Aug 27 21:36 2.txt
    -rw------- 1 root root   2135 Aug 27 21:44 anaconda-ks.cfg
    drwxr-xr-x 2 root root   4096 Aug 27 22:59 test
    -rw------- 1 root root 324951 Aug 27 21:23 test-1
    -rw------- 1 root root 324951 Aug 27 21:23 test-2
    -rw------- 1 root root 324951 Aug 27 21:23 test-3
    -rw-r--r-- 1 root root      0 Aug 27 23:05 test.log
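A single touch after the failover shows the mount works again, but not whether writes failed while the VIP was moving. One way to probe that is to keep writing during the reboot and count the failures. The sketch below is only an illustration; probe_writes and the ha-probe.tmp file name are hypothetical, and a write that stalls instead of failing will show up as a pause in the loop rather than in the count.

```shell
#!/bin/bash
# Hypothetical helper: write a timestamp into <dir> <count> times,
# pausing <delay> seconds between attempts, and print how many writes failed.
probe_writes() {
    local dir="$1" count="$2" delay="${3:-1}" failed=0 i
    for ((i = 1; i <= count; i++)); do
        # Group the command so redirection errors are silenced as well.
        if ! { date +%s > "$dir/ha-probe.tmp"; } 2>/dev/null; then
            failed=$((failed + 1))
        fi
        sleep "$delay"
    done
    rm -f "$dir/ha-probe.tmp"
    echo "$failed"
}

# Example (on the client, started just before rebooting node 1):
#   probe_writes /data 60 1
```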
Check the state of node 1
    [root@data-node-01 ~]# ip a show dev ens34
    3: ens34: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
        link/ether 00:0c:29:b2:b5:2a brd ff:ff:ff:ff:ff:ff
        inet 172.16.10.10/24 brd 172.16.10.255 scope global ens34
           valid_lft forever preferred_lft forever
        inet6 fe80::ce9a:ee2e:7b6c:a6bb/64 scope link
           valid_lft forever preferred_lft forever
Start the keepalived service again
    [root@data-node-01 ~]# systemctl start keepalived.service
Check the keepalived log (master/backup state)
    Aug 27 23:07:42 data-node-01 systemd: Starting LVS and VRRP High Availability Monitor...
    Aug 27 23:07:43 data-node-01 Keepalived[2914]: Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
    Aug 27 23:07:43 data-node-01 Keepalived[2914]: Opening file '/etc/keepalived/keepalived.conf'.
    Aug 27 23:07:43 data-node-01 Keepalived[2915]: Starting Healthcheck child process, pid=2916
    Aug 27 23:07:43 data-node-01 systemd: Started LVS and VRRP High Availability Monitor.
    Aug 27 23:07:43 data-node-01 Keepalived[2915]: Starting VRRP child process, pid=2917
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: Registering Kernel netlink reflector
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: Registering Kernel netlink command channel
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: Registering gratuitous ARP shared channel
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: Opening file '/etc/keepalived/keepalived.conf'.
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: WARNING - default user 'keepalived_script' for script execution does not exist - please create.
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: SECURITY VIOLATION - scripts are being executed but script_security not enabled.
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: Sync group GFS_HA_GROUP has only 1 virtual router(s) - removing
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: VRRP_Instance(GFS_HA_1) removing protocol VIPs.
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: Using LinkWatch kernel netlink reflector...
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: VRRP_Instance(GFS_HA_1) Entering BACKUP STATE
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: VRRP sockpool: [ifindex(3), proto(112), unicast(0), fd(10,11)]
    Aug 27 23:07:43 data-node-01 Keepalived_vrrp[2917]: VRRP_Script(monitor_glusterfs_status) succeeded
    Aug 27 23:07:43 data-node-01 kernel: nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
    Aug 27 23:07:43 data-node-01 kernel: IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
    Aug 27 23:07:43 data-node-01 kernel: IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
    Aug 27 23:07:43 data-node-01 kernel: IPVS: Creating netns size=2040 id=0
    Aug 27 23:07:43 data-node-01 Keepalived_healthcheckers[2916]: Opening file '/etc/keepalived/keepalived.conf'.
    Aug 27 23:07:43 data-node-01 kernel: IPVS: ipvs loaded.
    Aug 27 23:07:45 data-node-01 Keepalived_vrrp[2917]: VRRP_Instance(GFS_HA_1) Changing effective priority from 100 to 120
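This shows that after node 1 recovers from the failure, keepalived on it enters the backup state and keeps monitoring the GlusterFS service. If node 2 then fails, the service, storage, and VIP switch back to node 1, which continues to provide the storage service externally, and this is how storage high availability is achieved.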