Ceph storage is really underlying infrastructure and should be set up before the Kubernetes cluster is deployed, so that it can provide storage services to k8s. It can be used to store pod data, docker images, log data, and so on.
Ceph is a distributed storage system that is unique in providing object storage, block storage, and file storage from a single unified system, the Ceph storage cluster. The storage cluster is built on RADOS and scales enormously, serving thousands of clients accessing petabytes or even exabytes of data. Ceph nodes run on commodity hardware with intelligent daemons; the storage cluster organizes a large number of nodes that communicate with one another to replicate data and redistribute it dynamically using the CRUSH algorithm.
Ceph has a lot of terminology of its own, and knowing it is important for understanding the architecture. The common terms are:
Term | Description |
---|---|
RADOSGW | Object gateway daemon |
RBD | Block storage |
CEPHFS | File storage |
LIBRADOS | The base library for talking to RADOS. Ceph speaks to RADOS over a native protocol, which is wrapped in librados so you can build your own clients |
RADOS | The storage cluster itself |
OSD | Object Storage Device, the RADOS component that stores the data |
Monitor | The RADOS component that maintains the global state of the Ceph cluster |
MDS | Ceph Metadata Server, stores metadata for the Ceph file system |
A Ceph distributed storage cluster is made up of several components: Ceph Monitor, Ceph OSD, and Ceph MDS. If you only use object storage and block storage, MDS is not required; it is only needed when you use CephFS. We do need to install MDS here.
Does Ceph RBD support being mounted by multiple Pods at the same time? The official documentation says no: a Persistent Volume backed by Ceph RBD only supports two access modes, ReadWriteOnce and ReadOnlyMany, not ReadWriteMany.
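For illustration, a minimal sketch of such a PV follows. The monitor addresses are the ones used later in this cluster; the pool name, image name, and secret name are hypothetical placeholders, not something created in this walkthrough:

```
# Sketch of a Kubernetes PV backed by Ceph RBD (in-tree rbd volume plugin).
# "kube", "ceph-image" and "ceph-secret" are hypothetical names.
cat <<'EOF' > rbd-pv.yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ceph-rbd-pv
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce        # or ReadOnlyMany; ReadWriteMany is not supported for RBD
  rbd:
    monitors:
      - 172.24.10.20:6789
      - 172.24.10.21:6789
      - 172.24.10.22:6789
    pool: kube             # hypothetical pool name
    image: ceph-image      # hypothetical RBD image name
    user: admin
    secretRef:
      name: ceph-secret    # hypothetical secret holding the admin key
    fsType: ext4
    readOnly: false
EOF
kubectl apply -f rbd-pv.yaml
```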
Ceph's installation model is a bit like that of k8s: a deploy node remotely drives the other nodes to create, prepare, and activate the Ceph components on each of them.
Resources are limited, so the Ceph cluster is deployed on the k8s cluster's nodes; the hosts below are the same ones used by the k8s cluster, which may make them a little harder to tell apart.
Node name | Hostname | Node IP | Config | Notes |
---|---|---|---|---|
ceph-mon-0 | node-01 | 172.24.10.20 | centos7.4 | admin node, monitor, mds |
ceph-mon-1 | node-02 | 172.24.10.21 | centos7.4 | monitor, mds, client |
ceph-mon-2 | node-03 | 172.24.10.22 | centos7.4 | monitor, mds |
ceph-osd-0 | node-01 | 172.24.10.20 | 20G | storage node (osd) |
ceph-osd-1 | node-02 | 172.24.10.21 | 20G | storage node (osd) |
ceph-osd-2 | node-03 | 172.24.10.22 | 20G | storage node (osd) |
ceph-osd-3 | node-04 | 172.24.10.23 | 20G | storage node (osd) |
ceph-osd-4 | node-05 | 172.24.10.24 | 20G | storage node (osd) |
ceph-osd-5 | node-06 | 172.24.10.25 | 20G | storage node (osd) |
```
~]# cat /etc/hosts
127.0.0.1    localhost localhost.localdomain localhost4 localhost4.localdomain4
::1          localhost localhost.localdomain localhost6 localhost6.localdomain6
172.24.10.20 node-01
172.24.10.21 node-02
172.24.10.22 node-03
172.24.10.23 node-04
172.24.10.24 node-05
172.24.10.25 node-06
```
Also set up passwordless ssh-key login from the admin node to all the other nodes.
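A minimal way to do that from the admin node might look like this (a sketch; the hostnames come from the table above, and it assumes you push the key as the user that will later drive ansible and ceph-deploy):

```
# Generate a key pair once on the admin node, then push it to every node
ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for h in node-01 node-02 node-03 node-04 node-05 node-06; do
  ssh-copy-id "root@${h}"
done
```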
```
~]# yum install epel-release -y && yum upgrade -y
~]# rpm -Uvh https://download.ceph.com/rpm-luminous/el7/noarch/ceph-release-1-1.el7.noarch.rpm
~]# ansible ceph -a 'rpm -Uvh https://download.ceph.com/rpm-luminous/el7/noarch/ceph-release-1-1.el7.noarch.rpm'   # batch install on all nodes

# The official repo is too slow and the later cluster build keeps timing out,
# so switch /etc/yum.repos.d/ceph.repo to the Aliyun mirror:
[Ceph]
name=Ceph packages for $basearch
baseurl=http://mirrors.aliyun.com/ceph/rpm-luminous/el7/$basearch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[Ceph-noarch]
name=Ceph noarch packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-luminous/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

[ceph-source]
name=Ceph source packages
baseurl=http://mirrors.aliyun.com/ceph/rpm-luminous/el7/SRPMS
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://download.ceph.com/keys/release.asc

# ansible ceph -m copy -a 'src=/etc/yum.repos.d/ceph.repo dest=/etc/yum.repos.d/ceph.repo'
```
```
# Generate a password hash for the ceph user
python -c "from passlib.hash import sha512_crypt; import getpass; print sha512_crypt.encrypt(getpass.getpass())"
Password: ceph
$6$rounds=656000$PZshbGs2TMKtUgB1$LTdZj9xxHsJH5wRNSLYQL8CH7bAaE4415g/aRZD39RJiRrPx.Bzu19Y5/aOqQuFUunr7griuDN7BAlcTOkuw81

# On the local machine: sudo visudo
Defaults:ceph timestamp_timeout=-1
ceph ALL=(root) NOPASSWD:ALL

# ansible playbook that creates the user, password, sudoers and ssh key
~]# vim user.yml
- hosts: ceph
  remote_user: root
  tasks:
  - name: add user
    user: name=ceph password='$6$rounds=656000$PZshbGs2TMKtUgB1$LTdZj9xxHsJH5wRNSLYQL8CH7bAaE4415g/aRZD39RJiRrPx.Bzu19Y5/aOqQuFUunr7griuDN7BAlcTOkuw81'
  - name: sudo config
    copy: src=/etc/sudoers dest=/etc/sudoers
  - name: sync ssh key
    authorized_key: user=ceph state=present exclusive=yes key='{{lookup('file', '/home/ceph/.ssh/id_rsa.pub')}}'

# Run the playbook
ansible-playbook user.yml
```
Run the following on the admin node.
```
~]$ sudo yum install ceph-deploy
```
```
~]$ mkdir ceph-cluster
~]$ cd ceph-cluster
ceph-cluster]$ ceph-deploy new node-01 node-02 node-03
```
```
$ cat ceph.conf
[global]
fsid = 64960081-9cfe-4b6f-a9ae-eb9b2be216bc
mon_initial_members = node-01, node-02, node-03
mon_host = 172.24.10.20,172.24.10.21,172.24.10.22
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
# default number of replicas per pool (set here to match the 6 OSDs)
osd pool default size = 6

[mon]
# allow pools to be deleted in this cluster
mon_allow_pool_delete = true

[mgr]
mgr modules = dashboard
```
```
~]$ ceph-deploy install --no-adjust-repos node-01 node-02 node-03 node-04 node-05 node-06
# Without --no-adjust-repos, ceph-deploy keeps rewriting the repos back to its default
# (official) source, which is a real trap
```
Initialize the monitors and gather all the keys.
```
cd ceph-cluster/
ceph-deploy mon create-initial
```
Error 1:
```
ceph-mon-2 Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon-2.asok mon_status
ceph-mon-2 admin_socket: exception getting command descriptions: [Errno 2] No such file or directory
ceph_deploy.mon mon.ceph-mon-2 monitor is not yet in quorum, tries left: 1
ceph_deploy.mon waiting 20 seconds before retrying
ceph_deploy.mon Some monitors have still not reached quorum:
ceph_deploy.mon ceph-mon-0
ceph_deploy.mon ceph-mon-1
ceph_deploy.mon ceph-mon-2

# Check the /var/run/ceph directory
]$ ls /var/run/ceph/
ceph-mon.k8s-master-01.asok    # the admin socket is named after the node's hostname

# Tear down the broken monitors
]$ ceph-deploy mon destroy node-01 node-02 node-03
```
Still not working: each node must have a unique hostname, and the monitor names must match those hostnames.
Clean up the environment
```
$ ceph-deploy purge node-01 node-02 node-03 node-04 node-05 node-06      # removes everything ceph-related, packages included
$ ceph-deploy purgedata node-01 node-02 node-03 node-04 node-05 node-06
$ ceph-deploy forgetkeys
```
Error 2
```
[node-03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node-03.asok mon_status
[ceph_deploy.mon][WARNIN] mon.node-03 monitor is not yet in quorum, tries left: 5
[ceph_deploy.mon][WARNIN] waiting 5 seconds before retrying
[node-03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node-03.asok mon_status
[ceph_deploy.mon][WARNIN] mon.node-03 monitor is not yet in quorum, tries left: 4
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[node-03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node-03.asok mon_status
[ceph_deploy.mon][WARNIN] mon.node-03 monitor is not yet in quorum, tries left: 3
[ceph_deploy.mon][WARNIN] waiting 10 seconds before retrying
[node-03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node-03.asok mon_status
[ceph_deploy.mon][WARNIN] mon.node-03 monitor is not yet in quorum, tries left: 2
[ceph_deploy.mon][WARNIN] waiting 15 seconds before retrying
[node-03][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.node-03.asok mon_status
[ceph_deploy.mon][WARNIN] mon.node-03 monitor is not yet in quorum, tries left: 1
[ceph_deploy.mon][WARNIN] waiting 20 seconds before retrying
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] node-02
[ceph_deploy.mon][ERROR ] node-03
[ceph_deploy.mon][ERROR ] node-01
```
Solution
The iptables policy was blocking the traffic. Either flush the rules or open the monitors' default listening port 6789.
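Either approach might look roughly like this (a sketch reusing the ansible group from above; 6789 is the monitor port and 6800:7300 is the default OSD port range):

```
# Quick and dirty: flush the rules on every node
sudo ansible ceph -a 'iptables -F'

# Or, more conservatively, open just the ceph ports on each node
sudo iptables -I INPUT -p tcp --dport 6789 -j ACCEPT        # monitors
sudo iptables -I INPUT -p tcp --dport 6800:7300 -j ACCEPT   # OSD/MDS/MGR daemons
```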
Check the running service
```
$ ps -ef | grep ceph
ceph   4693   1  0 16:45 ?   00:00:00 /usr/bin/ceph-mon -f --cluster ceph --id node-01 --setuser ceph --setgroup ceph
# stopping it manually: see below
```
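For reference, with the systemd units shipped by the luminous packages a monitor can be stopped by hand like this (assuming the unit id matches the hostname shown in the ps output):

```
# Stop just the monitor on node-01
sudo systemctl stop ceph-mon@node-01
# Or stop every ceph daemon on the node
sudo systemctl stop ceph.target
```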
From the admin node, log in to each osd node and create its data directory (old-style deployment):
```
# osd-0
ssh node-01
sudo mkdir /var/local/osd0
sudo chown -R ceph.ceph /var/local/osd0
# osd-1
ssh node-02
sudo mkdir /var/local/osd1
sudo chown -R ceph.ceph /var/local/osd1
# osd-2
ssh node-03
sudo mkdir /var/local/osd2
sudo chown -R ceph.ceph /var/local/osd2
# osd-3
ssh node-04
sudo mkdir /var/local/osd3
sudo chown -R ceph.ceph /var/local/osd3
# osd-4
ssh node-05
sudo mkdir /var/local/osd4
sudo chown -R ceph.ceph /var/local/osd4
# osd-5
ssh node-06
sudo mkdir /var/local/osd5
sudo chown -R ceph.ceph /var/local/osd5
```
From the admin node, prepare each osd (old-style deployment):
```
ceph-deploy osd prepare node-01:/var/local/osd0 node-02:/var/local/osd1 node-03:/var/local/osd2 node-04:/var/local/osd3 node-05:/var/local/osd4 node-06:/var/local/osd5
# add --overwrite-conf if the local ceph.conf differs from the one already on the nodes
```
Activate each osd node (old-style deployment):
```
ceph-deploy osd activate node-01:/var/local/osd0 node-02:/var/local/osd1 node-03:/var/local/osd2 node-04:/var/local/osd3 node-05:/var/local/osd4 node-06:/var/local/osd5
```
Add and activate the osd disks (old-style deployment):
```
ceph-deploy osd create --bluestore node-01:/var/local/osd0 node-02:/var/local/osd1 node-03:/var/local/osd2 node-04:/var/local/osd3 node-05:/var/local/osd4 node-06:/var/local/osd5
```
Newer versions of ceph-deploy simply use create, which is equivalent to prepare + activate and creates a bluestore OSD by default:
```
ceph-deploy osd create --data /dev/sdb node-01
ceph-deploy osd create --data /dev/sdb node-02
ceph-deploy osd create --data /dev/sdb node-03
ceph-deploy osd create --data /dev/sdb node-04
ceph-deploy osd create --data /dev/sdb node-05
ceph-deploy osd create --data /dev/sdb node-06
```
From the admin node, copy the config file and admin key to the admin node and all Ceph nodes:
```
ceph-deploy admin node-01 node-02 node-03 node-04 node-05 node-06
```
On every node, make ceph.client.admin.keyring readable:
```
sudo ansible ceph -a 'chmod +r /etc/ceph/ceph.client.admin.keyring'
```
```
$ ceph -s
  cluster:
    id:     64960081-9cfe-4b6f-a9ae-eb9b2be216bc
    health: HEALTH_WARN
            clock skew detected on mon.node-02, mon.node-03

  services:
    mon: 3 daemons, quorum node-01,node-02,node-03
    mgr: node-01(active), standbys: node-02, node-03
    osd: 6 osds: 6 up, 6 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   6337 MB used, 113 GB / 119 GB avail
    pgs:
```
Fixing the health warning
```
health: HEALTH_WARN
        clock skew detected on mon.node-02, mon.node-03
```
This warning is caused by clock drift between the monitors; sync the time on every node:
```
$ sudo ansible ceph -a 'yum install ntpdate -y'
$ sudo ansible ceph -a 'systemctl stop ntpdate'
$ sudo ansible ceph -a 'ntpdate time.windows.com'

$ ceph -s
  cluster:
    id:     64960081-9cfe-4b6f-a9ae-eb9b2be216bc
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum node-01,node-02,node-03
    mgr: node-01(active), standbys: node-03, node-02
    mds: cephfs-1/1/1 up {0=node-02=up:active}, 2 up:standby
    osd: 6 osds: 6 up, 6 in

  data:
    pools:   2 pools, 192 pgs
    objects: 21 objects, 2246 bytes
    usage:   6354 MB used, 113 GB / 119 GB avail
    pgs:     192 active+clean
```
Check the status
```
$ ceph osd tree
ID  CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
 -1       0.11691 root default
 -3       0.01949     host node-01
  0   hdd 0.01949         osd.0        up  1.00000 1.00000
 -5       0.01949     host node-02
  1   hdd 0.01949         osd.1        up  1.00000 1.00000
 -7       0.01949     host node-03
  2   hdd 0.01949         osd.2        up  1.00000 1.00000
 -9       0.01949     host node-04
  3   hdd 0.01949         osd.3        up  1.00000 1.00000
-11       0.01949     host node-05
  4   hdd 0.01949         osd.4        up  1.00000 1.00000
-13       0.01949     host node-06
  5   hdd 0.01949         osd.5        up  1.00000 1.00000
```
Check the mounts
```
$ df -Th
Filesystem              Type      Size  Used Avail Use% Mounted on
/dev/mapper/centos-root xfs        17G  1.5G   16G   9% /
devtmpfs                devtmpfs  478M     0  478M   0% /dev
tmpfs                   tmpfs     488M     0  488M   0% /dev/shm
tmpfs                   tmpfs     488M  6.6M  482M   2% /run
tmpfs                   tmpfs     488M     0  488M   0% /sys/fs/cgroup
/dev/sda1               xfs      1014M  153M  862M  16% /boot
tmpfs                   tmpfs      98M     0   98M   0% /run/user/0
tmpfs                   tmpfs     488M   48K  488M   1% /var/lib/ceph/osd/ceph-0

]$ cat /var/lib/ceph/osd/ceph-0/type
bluestore
```
Since Ceph 12 (Luminous), the manager daemon is mandatory. Add a mgr to every machine that runs a monitor, otherwise the cluster stays in the WARN state.
```
$ ceph-deploy mgr create node-01 node-02 node-03
ceph config-key put mgr/dashboard/server_addr 172.24.10.20
ceph config-key put mgr/dashboard/server_port 7000
ceph mgr module enable dashboard
# dashboard: http://172.24.10.20:7000/
```
Placement group sizing reference: http://docs.ceph.com/docs/master/rados/operations/placement-groups/
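The pool sizes used below roughly follow the rule of thumb from that page: total PGs ≈ (number of OSDs × 100) / replica count, rounded up to the nearest power of two. A quick sanity check with this cluster's numbers:

```
# 6 OSDs and "osd pool default size = 6" from ceph.conf above
echo $(( 6 * 100 / 6 ))   # 100 -> next power of two is 128, which is what cephfs_data uses
```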
```
$ ceph-deploy mds create node-01 node-02 node-03
$ ceph osd pool create cephfs_data 128
$ ceph osd pool create cephfs_metadata 64
$ ceph fs new cephfs cephfs_metadata cephfs_data
$ ceph fs ls
name: cephfs, metadata pool: cephfs_metadata, data pools: [cephfs_data ]
$ ceph mds stat
cephfs-1/1/1 up {0=node-02=up:active}, 2 up:standby
```
Although multiple active MDS daemons can run in parallel, the official documentation recommends keeping a single active MDS and leaving the others as standby.
The client is planned to live on node-02.
On a physical machine, cephfs can be mounted with the mount command, with mount.ceph (apt-get install ceph-fs-common), or with ceph-fuse (apt-get install ceph-fuse). We use the mount command first.
```
$ sudo mkdir /data/ceph-storage/ -p
$ sudo chown -R ceph.ceph /data/ceph-storage
$ ceph-authtool -l /etc/ceph/ceph.client.admin.keyring
[client.admin]
        key = AQAEKJFa54MlFRAAg76JDhpwlHD1F8J2G76baQ==
$ sudo mount -t ceph 172.24.10.21:6789:/ /data/ceph-storage/ -o name=admin,secret=AQAEKJFa54MlFRAAg76JDhpwlHD1F8J2G76baQ==
$ df -Th
Filesystem              Type      Size  Used Avail Use% Mounted on
/dev/mapper/centos-root xfs        17G  1.5G   16G   9% /
devtmpfs                devtmpfs  478M     0  478M   0% /dev
tmpfs                   tmpfs     488M     0  488M   0% /dev/shm
tmpfs                   tmpfs     488M  6.7M  481M   2% /run
tmpfs                   tmpfs     488M     0  488M   0% /sys/fs/cgroup
/dev/sda1               xfs      1014M  153M  862M  16% /boot
tmpfs                   tmpfs      98M     0   98M   0% /run/user/0
tmpfs                   tmpfs     488M   48K  488M   1% /var/lib/ceph/osd/ceph-1
tmpfs                   tmpfs      98M     0   98M   0% /run/user/1000
172.24.10.21:6789:/     ceph      120G  6.3G  114G   6% /data/ceph-storage
```
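For comparison, the ceph-fuse route mentioned above looks roughly like this (a sketch; it assumes ceph-fuse is installed and that /etc/ceph/ceph.conf plus the admin keyring are already on the client, as distributed by ceph-deploy admin earlier):

```
$ sudo ceph-fuse -m 172.24.10.21:6789 /data/ceph-storage
# unmount later with:
$ sudo fusermount -u /data/ceph-storage
```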