Installing a Ceph cluster in containers is much simpler to carry out than a bare-metal installation, but day-to-day operation can involve quite a few pitfalls. Below is a brief walkthrough.
1 Node planning
admin 172.18.1.193
node1 172.18.1.195
node2 172.18.1.196
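The steps below address node1 and node2 by hostname (for example in the scp commands), which assumes name resolution is already in place between the nodes. A minimal /etc/hosts sketch for every node, using the IPs from the plan above (this file is an assumption and is not shown in the original transcript):

# /etc/hosts on each of the three nodes
172.18.1.193 admin
172.18.1.195 node1
172.18.1.196 node2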
2 Pull the jewel release of the ceph/daemon image
[root@admin osd2]# docker pull docker.io/ceph/daemon:tag-build-master-jewel-centos-7
[root@admin osd2]# docker images|grep jew
docker.io/ceph/daemon   tag-build-master-jewel-centos-7   74723dc740be   7 weeks ago   677.5 MB
3 Run the mon container on the admin node
[root@admin osd2]# docker run -d --net=host --name=mon -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -e MON_IP=172.18.1.193 -e CEPH_PUBLIC_NETWORK=172.18.1.0/24 docker.io/ceph/daemon:tag-build-master-jewel-centos-7 mon
3a12bbd4b81cfb2e3a234f08fb2c751ffa4e3482a2602733b5d4f7e8b934782e
# the mon container is now running
[root@admin osd2]# docker ps|grep mon
3a12bbd4b81c   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh mon"   29 seconds ago   Up 28 seconds   mon
# check the mon container logs (there is usually no problem here; the full output is not shown)
[root@admin osd2]# docker logs mon
creating /etc/ceph/ceph.client.admin.keyring
creating /etc/ceph/ceph.mon.keyring
creating /var/lib/ceph/bootstrap-osd/ceph.keyring
creating /var/lib/ceph/bootstrap-mds/ceph.keyring
creating /var/lib/ceph/bootstrap-rgw/ceph.keyring
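For reference, here is the same command in generic form with each option annotated (a sketch of the parameters used above; only the MON_IP value changes per node):

# --net=host             : use the host network so the mon binds to the node's own IP
# -v /etc/ceph           : share cluster config and keyrings with the host
# -v /var/lib/ceph       : share mon data and bootstrap keyrings with the host
# -e MON_IP              : this node's monitor IP (the only value that changes per node)
# -e CEPH_PUBLIC_NETWORK : the public network the monitors listen on
# trailing "mon"         : tells entrypoint.sh which daemon role to start
docker run -d --net=host --name=mon \
  -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph \
  -e MON_IP=172.18.1.193 -e CEPH_PUBLIC_NETWORK=172.18.1.0/24 \
  docker.io/ceph/daemon:tag-build-master-jewel-centos-7 mon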
4 Copy the configuration and system files to the other two nodes, then start mon on them in the same way
This step is very important. If you start mon on the other nodes without first copying over the configuration and system files generated on the admin node when its mon was set up, each of the three nodes will start its own separate Ceph cluster instead of forming three mon nodes of a single cluster.
[root@admin osd2]# scp -r /etc/ceph node1:/etc/
root@node1's password:
ceph.conf                     100%  212     0.2KB/s   00:00
ceph.client.admin.keyring     100%  159     0.2KB/s   00:00
ceph.mon.keyring              100%  575     0.6KB/s   00:00
[root@admin osd2]# scp -r /etc/ceph node2:/etc/
root@node2's password:
ceph.conf                     100%  212     0.2KB/s   00:00
ceph.client.admin.keyring     100%  159     0.2KB/s   00:00
ceph.mon.keyring              100%  575     0.6KB/s   00:00
# check
[root@node1 osd4]# cd /etc/ceph/
[root@node1 ceph]# ll
total 12
-rw------- 1 root root 159 Apr 16 19:14 ceph.client.admin.keyring
-rw-r--r-- 1 root root 212 Apr 16 19:14 ceph.conf
-rw------- 1 root root 575 Apr 16 19:14 ceph.mon.keyring
[root@node2 /]# cd /etc/ceph/
[root@node2 ceph]# ll
total 12
-rw------- 1 root root 159 Apr 16 19:15 ceph.client.admin.keyring
-rw-r--r-- 1 root root 212 Apr 16 19:15 ceph.conf
-rw------- 1 root root 575 Apr 16 19:15 ceph.mon.keyring
[root@admin osd2]# scp -r /var/lib/ceph node1:/var/lib/
root@node1's password:
ceph.keyring                  100%  113     0.1KB/s   00:00
ceph.keyring                  100%  113     0.1KB/s   00:00
ceph.keyring                  100%  113     0.1KB/s   00:00
LOCK                          100%    0     0.0KB/s   00:00
CURRENT                       100%   16     0.0KB/s   00:00
000005.sst                    100% 1080     1.1KB/s   00:00
000006.log                    100%  192KB 192.0KB/s   00:00
MANIFEST-000004               100%   64KB  64.0KB/s   00:00
keyring                       100%   77     0.1KB/s   00:00
[root@admin osd2]# scp -r /var/lib/ceph node2:/var/lib/
root@node2's password:
ceph.keyring                  100%  113     0.1KB/s   00:00
ceph.keyring                  100%  113     0.1KB/s   00:00
ceph.keyring                  100%  113     0.1KB/s   00:00
LOCK                          100%    0     0.0KB/s   00:00
CURRENT                       100%   16     0.0KB/s   00:00
000005.sst                    100% 1080     1.1KB/s   00:00
000006.log                    100%  192KB 192.0KB/s   00:00
MANIFEST-000004               100%   64KB  64.0KB/s   00:00
keyring                       100%   77     0.1KB/s   00:00
# check
[root@node1 ceph]# cd /var/lib/ceph/
[root@node1 ceph]# ll
total 0
drwxr-xr-x 2 root root 25 Apr 16 19:17 bootstrap-mds
drwxr-xr-x 2 root root 25 Apr 16 19:17 bootstrap-osd
drwxr-xr-x 2 root root 25 Apr 16 19:17 bootstrap-rgw
drwxr-xr-x 3 root root 23 Apr 16 19:17 mds
drwxr-xr-x 3 root root 23 Apr 16 19:17 mon
drwxr-xr-x 2 root root  6 Apr 16 19:17 osd
drwxr-xr-x 3 root root 27 Apr 16 19:17 radosgw
drwxr-xr-x 3 root root 27 Apr 16 19:17 tmp
[root@node2 ceph]# cd /var/lib/ceph/
[root@node2 ceph]# ll
total 0
drwxr-xr-x 2 root root 25 Apr 16 19:17 bootstrap-mds
drwxr-xr-x 2 root root 25 Apr 16 19:17 bootstrap-osd
drwxr-xr-x 2 root root 25 Apr 16 19:17 bootstrap-rgw
drwxr-xr-x 3 root root 23 Apr 16 19:17 mds
drwxr-xr-x 3 root root 23 Apr 16 19:17 mon
drwxr-xr-x 2 root root  6 Apr 16 19:17 osd
drwxr-xr-x 3 root root 27 Apr 16 19:17 radosgw
drwxr-xr-x 3 root root 27 Apr 16 19:17 tmp
Start mon on the other nodes. The command is identical; only the MON_IP environment variable changes to each node's own IP.
[root@node1 ceph]# docker run -d --net=host --name=mon -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -e MON_IP=172.18.1.195 -e CEPH_PUBLIC_NETWORK=172.18.1.0/24 docker.io/ceph/daemon:tag-build-master-jewel-centos-7 mon
632e06c5735c927c80a974d84627184798f0e0becd78a87b20668dd07c024876
[root@node1 ceph]# docker ps|grep mon
632e06c5735c   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh mon"   About a minute ago   Up About a minute   mon
[root@node2 ceph]# docker run -d --net=host --name=mon -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -e MON_IP=172.18.1.196 -e CEPH_PUBLIC_NETWORK=172.18.1.0/24 docker.io/ceph/daemon:tag-build-master-jewel-centos-7 mon
1f42acaf33b0fe499f3e8d560b58080971648bf3d50e441cdb9b35189b40390d
[root@node2 ceph]# docker ps|grep mon
1f42acaf33b0   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh mon"   33 seconds ago   Up 31 seconds   mon
Check the Ceph cluster status
[root@admin osd2]# docker exec mon ceph -s
    cluster d4ec799c-1f54-4441-b19c-cd14a6a8710b
     health HEALTH_ERR
            clock skew detected on mon.node2
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            64 pgs stuck unclean
            no osds
            Monitor clock skew detected
     monmap e3: 3 mons at {admin=172.18.1.193:6789/0,node1=172.18.1.195:6789/0,node2=172.18.1.196:6789/0}
            election epoch 6, quorum 0,1,2 admin,node1,node2
     osdmap e1: 0 osds: 0 up, 0 in
            flags sortbitwise,require_jewel_osds
      pgmap v2: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating
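The three mons are in quorum, and HEALTH_ERR is expected at this point because no OSDs exist yet. The "clock skew detected on mon.node2" warning is a separate issue; a common way to clear it (not part of the original transcript, shown here as an assumption using standard CentOS 7 tooling) is to sync all three nodes to the same NTP source:

# run on every node (admin, node1, node2)
yum install -y chrony
systemctl enable --now chronyd
chronyc makestep     # force an immediate clock correction
chronyc sources -v   # verify the node is actually syncing

Once the clocks converge, the warning should clear on its own; if it lingers, restarting the mon containers (docker restart mon) usually helps.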
5 Set up the OSDs
First prepare the OSD disks on the three nodes (partition, format, mount, or LVM). The details are not covered here, but a sketch is shown after the mount check below.
[root@admin osd2]# df -Ph|grep osd
/dev/mapper/ceph-osd1   15G   33M   15G   1% /osd1
/dev/mapper/ceph-osd2   15G   33M   15G   1% /osd2
[root@node1 /]# df -Ph|grep osd
/dev/mapper/ceph-osd3   15G   33M   15G   1% /osd3
/dev/mapper/ceph-osd4   15G   33M   15G   1% /osd4
[root@node2 /]# df -Ph|grep osd
/dev/mapper/ceph-osd5   15G   33M   15G   1% /osd5
/dev/mapper/ceph-osd6   15G   33M   15G   1% /osd6
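For reference, a minimal sketch of how one such LVM-backed mount could have been prepared (the device /dev/sdb, the volume group name ceph, and the XFS filesystem are assumptions; repeat per OSD with the appropriate names):

# create an LVM volume named osd1 in a volume group named ceph (device name is an assumption)
pvcreate /dev/sdb
vgcreate ceph /dev/sdb
lvcreate -L 15G -n osd1 ceph
# format it and mount it where the osd container will look for its data directory
mkfs.xfs /dev/mapper/ceph-osd1
mkdir -p /osd1
mount /dev/mapper/ceph-osd1 /osd1
echo '/dev/mapper/ceph-osd1 /osd1 xfs defaults 0 0' >> /etc/fstab   # persist across reboots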
Start the OSD containers on each of the three nodes
# osd1
[root@admin osd2]# docker run -d --net=host --name=osd1 -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev:/dev -v /osd1:/var/lib/ceph/osd --privileged=true docker.io/ceph/daemon:tag-build-master-jewel-centos-7 osd_directory
5e241e4fa243d5154ee9ac0982c4790d29f6df0ff820be76a812d676e134fc2c
# osd2
[root@admin osd2]# docker run -d --net=host --name=osd2 -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev:/dev -v /osd2:/var/lib/ceph/osd --privileged=true docker.io/ceph/daemon:tag-build-master-jewel-centos-7 osd_directory
cd857e4cd32b0c745a16654df302732cd1a4e54de26fe43753f8b5e1aae43829
# check osd1 and osd2
[root@admin osd2]# docker ps|grep osd
cd857e4cd32b   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh osd_d"   About a minute ago   Up About a minute   osd2
5e241e4fa243   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh osd_d"   2 minutes ago        Up 2 minutes        osd1
# osd3
[root@node1 /]# docker run -d --net=host --name=osd3 -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev:/dev -v /osd3:/var/lib/ceph/osd --privileged=true docker.io/ceph/daemon:tag-build-master-jewel-centos-7 osd_directory
5b895ce07e313111c2a023ac8755cc6a0bee9bc36640b71d69436e6fcf555636
# osd4
[root@node1 /]# docker run -d --net=host --name=osd4 -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev:/dev -v /osd4:/var/lib/ceph/osd --privileged=true docker.io/ceph/daemon:tag-build-master-jewel-centos-7 osd_directory
2e14db1a3049b57573fdc008a3f0051a02a1eacc10c5495ad996eebd51ad843c
# check osd3 and osd4
[root@node1 /]# docker ps|grep osd
2e14db1a3049   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh osd_d"   8 seconds ago    Up 7 seconds    osd4
5b895ce07e31   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh osd_d"   33 seconds ago   Up 33 seconds   osd3
# osd5
[root@node2 /]# docker run -d --net=host --name=osd5 -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev:/dev -v /osd5:/var/lib/ceph/osd --privileged=true docker.io/ceph/daemon:tag-build-master-jewel-centos-7 osd_directory
7af739ebd117f07db85ea2f281054259b1bb31b19ca21238e9395675c0bbf56c
# osd6
[root@node2 /]# docker run -d --net=host --name=osd6 -v /etc/ceph:/etc/ceph -v /var/lib/ceph:/var/lib/ceph -v /dev:/dev -v /osd6:/var/lib/ceph/osd --privileged=true docker.io/ceph/daemon:tag-build-master-jewel-centos-7 osd_directory
79cf854b8f30d129ee5c61ff892b8c2e89df29b38adfa111cdebb90933c71c32
# check osd5 and osd6
[root@node2 /]# docker ps|grep osd
79cf854b8f30   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh osd_d"   22 seconds ago   Up 18 seconds   osd6
7af739ebd117   docker.io/ceph/daemon:tag-build-master-jewel-centos-7   "/entrypoint.sh osd_d"   43 seconds ago   Up 41 seconds   osd5
Check the Ceph health status and the OSD tree
[root@admin osd2]# docker exec mon ceph -s
    cluster d4ec799c-1f54-4441-b19c-cd14a6a8710b
     health HEALTH_ERR
            clock skew detected on mon.node2
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            64 pgs stuck unclean
            too few PGs per OSD (10 < min 30)
            Monitor clock skew detected
     monmap e3: 3 mons at {admin=172.18.1.193:6789/0,node1=172.18.1.195:6789/0,node2=172.18.1.196:6789/0}
            election epoch 6, quorum 0,1,2 admin,node1,node2
     osdmap e13: 6 osds: 6 up, 6 in
            flags sortbitwise,require_jewel_osds
      pgmap v46: 64 pgs, 1 pools, 0 bytes data, 0 objects
            794 MB used, 90905 MB / 91700 MB avail
                  64 creating
# The 64 PGs above are stuck in the creating state. The default rbd pool does have 64 PGs, but because of the OSD problem the PGs cannot be created successfully.
[root@admin osd2]# docker exec mon ceph osd pool get rbd pg_num
pg_num: 64
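The "too few PGs per OSD (10 < min 30)" warning no longer appears in the final ceph -s output further below, once the CRUSH map has been repaired and the PGs are placed across the OSDs. If such a warning did persist, the usual remedy is to raise the pool's PG count; a sketch (128 is an assumed value for 6 OSDs, and pg_num can only ever be increased, never decreased):

docker exec mon ceph osd pool set rbd pg_num 128
docker exec mon ceph osd pool set rbd pgp_num 128   # pgp_num must follow pg_num for placement to change
docker exec mon ceph osd pool get rbd pg_num        # confirm the new value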
[root@admin osd2]# docker exec mon ceph osd tree
ID WEIGHT TYPE NAME    UP/DOWN REWEIGHT PRIMARY-AFFINITY
-1      0 root default
 0      0 osd.0             up  1.00000          1.00000
 1      0 osd.1             up  1.00000          1.00000
 2      0 osd.2             up  1.00000          1.00000
 3      0 osd.3             up  1.00000          1.00000
 4      0 osd.4             up  1.00000          1.00000
 5      0 osd.5             up  1.00000          1.00000
# All OSDs are up, but every weight is 0 and none of them belongs to a host bucket. It appears that no CRUSH map entries were generated when the OSDs were brought up.
6 Manually repair the CRUSH map
6.1 Add the OSDs to the CRUSH map
[root@admin osd2]# docker exec mon ceph osd crush add osd.0 0.15 host=admin
add item id 0 name 'osd.0' weight 0.15 at location {host=admin} to crush map
[root@admin osd2]# docker exec mon ceph osd crush add osd.1 0.15 host=admin
add item id 1 name 'osd.1' weight 0.15 at location {host=admin} to crush map
[root@admin osd2]# docker exec mon ceph osd crush add osd.2 0.15 host=node1
add item id 2 name 'osd.2' weight 0.15 at location {host=node1} to crush map
[root@admin osd2]# docker exec mon ceph osd crush add osd.3 0.15 host=node1
add item id 3 name 'osd.3' weight 0.15 at location {host=node1} to crush map
[root@admin osd2]# docker exec mon ceph osd crush add osd.4 0.15 host=node2
add item id 4 name 'osd.4' weight 0.15 at location {host=node2} to crush map
[root@admin osd2]# docker exec mon ceph osd crush add osd.5 0.15 host=node2
add item id 5 name 'osd.5' weight 0.15 at location {host=node2} to crush map
[root@admin osd2]# docker exec mon ceph osd tree
ID WEIGHT  TYPE NAME      UP/DOWN REWEIGHT PRIMARY-AFFINITY
-4 0.29999     host node2
 4 0.14999         osd.4       up  1.00000          1.00000
 5 0.14999         osd.5       up  1.00000          1.00000
-3 0.29999     host node1
 2 0.14999         osd.2       up  1.00000          1.00000
 3 0.14999         osd.3       up  1.00000          1.00000
-2 0.29999     host admin
 0 0.14999         osd.0       up  1.00000          1.00000
 1 0.14999         osd.1       up  1.00000          1.00000
-1       0 root default
# The CRUSH map now exists and the OSDs have weights, but the three host buckets are still not placed under root default.
6.2 Update the CRUSH map so that the three hosts sit under root default
[root@admin osd2]# docker exec mon ceph osd crush move admin root=default
moved item id -2 name 'admin' to location {root=default} in crush map
[root@admin osd2]# docker exec mon ceph osd crush move node1 root=default
moved item id -3 name 'node1' to location {root=default} in crush map
[root@admin osd2]# docker exec mon ceph osd crush move node2 root=default
moved item id -4 name 'node2' to location {root=default} in crush map
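The same repair can also be done by exporting the CRUSH map, editing it as text, and injecting it back, which is handier when many buckets are wrong at once. A sketch of that alternative approach (not what was done above; file names are arbitrary):

# dump and decompile the current CRUSH map inside the mon container
docker exec mon ceph osd getcrushmap -o /tmp/crushmap.bin
docker exec mon crushtool -d /tmp/crushmap.bin -o /tmp/crushmap.txt
# copy it out, edit the host/root buckets by hand, then recompile and inject it
docker cp mon:/tmp/crushmap.txt .
vi crushmap.txt
docker cp crushmap.txt mon:/tmp/crushmap.txt
docker exec mon crushtool -c /tmp/crushmap.txt -o /tmp/crushmap.new
docker exec mon ceph osd setcrushmap -i /tmp/crushmap.new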
Check the Ceph health status again
[root@admin osd2]# docker exec mon ceph -s
    cluster d4ec799c-1f54-4441-b19c-cd14a6a8710b
     health HEALTH_ERR
            clock skew detected on mon.node2
            4 pgs are stuck inactive for more than 300 seconds
            5 pgs degraded
            23 pgs peering
            4 pgs stuck inactive
            57 pgs stuck unclean
            5 pgs undersized
            Monitor clock skew detected
     monmap e3: 3 mons at {admin=172.18.1.193:6789/0,node1=172.18.1.195:6789/0,node2=172.18.1.196:6789/0}
            election epoch 6, quorum 0,1,2 admin,node1,node2
     osdmap e29: 6 osds: 6 up, 6 in
            flags sortbitwise,require_jewel_osds
      pgmap v99: 64 pgs, 1 pools, 0 bytes data, 0 objects
            799 MB used, 90900 MB / 91700 MB avail
                  29 activating
                  15 remapped+peering
                   8 peering
                   7 active+clean
                   5 active+undersized+degraded
[root@admin osd2]# docker exec mon ceph -s
    cluster d4ec799c-1f54-4441-b19c-cd14a6a8710b
     health HEALTH_WARN
            clock skew detected on mon.node2
            16 pgs stuck unclean
            Monitor clock skew detected
     monmap e3: 3 mons at {admin=172.18.1.193:6789/0,node1=172.18.1.195:6789/0,node2=172.18.1.196:6789/0}
            election epoch 6, quorum 0,1,2 admin,node1,node2
     osdmap e29: 6 osds: 6 up, 6 in
            flags sortbitwise,require_jewel_osds
      pgmap v104: 64 pgs, 1 pools, 0 bytes data, 0 objects
            799 MB used, 90900 MB / 91700 MB avail
                  48 active+clean
                  16 activating
[root@admin osd2]# docker exec mon ceph -s
    cluster d4ec799c-1f54-4441-b19c-cd14a6a8710b
     health HEALTH_WARN
            clock skew detected on mon.node2
            Monitor clock skew detected
     monmap e3: 3 mons at {admin=172.18.1.193:6789/0,node1=172.18.1.195:6789/0,node2=172.18.1.196:6789/0}
            election epoch 6, quorum 0,1,2 admin,node1,node2
     osdmap e29: 6 osds: 6 up, 6 in
            flags sortbitwise,require_jewel_osds
      pgmap v108: 64 pgs, 1 pools, 0 bytes data, 0 objects
            799 MB used, 90900 MB / 91700 MB avail
                  64 active+clean
As the successive outputs show, once the CRUSH map is repaired the PGs peer and go active+clean within a couple of minutes, and PGs can be created and accessed normally. Apart from the remaining clock skew warning (which time synchronization clears, as noted earlier), the cluster is healthy.
7 Test the Ceph cluster
Test creating an image in the block-storage (rbd) pool and uploading a file; only if both succeed can the cluster installation be considered successful.
[root@admin osd2]# docker exec mon rbd create rbd/test-image --size 100M
[root@admin osd2]# docker exec mon rbd ls rbd
test-image
[root@admin osd2]# docker exec mon rbd info rbd/test-image
rbd image 'test-image':
        size 102400 kB in 25 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.1083238e1f29
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
# the block device image was created successfully
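As an optional extra check, the image could be mapped on a host with the kernel rbd module and given a filesystem. Kernel clients of that era typically do not support all of the jewel image features listed above, so they usually need to be disabled first. A sketch, run on the admin host (this step is not part of the original test and requires ceph-common plus the rbd kernel module on the host; /dev/rbd0 and the mount point are assumptions):

# drop the features older kernel clients cannot handle, keeping layering
docker exec mon rbd feature disable rbd/test-image exclusive-lock object-map fast-diff deep-flatten
# map, format and mount the image on the host (the device name may differ)
rbd map rbd/test-image
mkfs.xfs /dev/rbd0
mkdir -p /mnt/test-image
mount /dev/rbd0 /mnt/test-image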
[root@admin osd2]# docker exec mon rados -p rbd put wzl /etc/fstab
[root@admin osd2]# docker exec mon rados -p rbd ls
rbd_header.1083238e1f29
wzl
rbd_object_map.1083238e1f29
rbd_directory
rbd_id.test-image
# the file was uploaded successfully
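To confirm the uploaded object is intact, it can be read back and compared with the source file. A small sketch inside the mon container (/tmp/wzl.check is an arbitrary name):

docker exec mon bash -c 'rados -p rbd get wzl /tmp/wzl.check && diff /etc/fstab /tmp/wzl.check && echo "object round-trips OK"'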
At this point the Ceph cluster installed with Docker containers is up and working. Follow-up ...