ceph: HEALTH_WARN: Monitor clock skew detected

Problem

ceph1:~ # ceph -s
    cluster f411aff0-1b95-4496-9310-68fa6d568903
     health HEALTH_WARN
            clock skew detected on mon.ceph1
            Monitor clock skew detected
     monmap e9: 2 mons at {ceph1=147.2.208.114:6789/0,ceph2=147.2.208.44:6789/0}
            election epoch 58, quorum 0,1 ceph2,ceph1
     osdmap e127: 3 osds: 3 up, 3 in
      pgmap v2318: 72 pgs, 2 pools, 0 bytes data, 0 objects
            105 MB used, 45941 MB / 46046 MB avail
                  72 active+clean

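Method 1

Configure an NTP server. I had already done that, but for some reason it had no effect. Wait a moment!

Only later did I realize that I had not understood NTP well enough: I had pointed the nodes at a public NTP server in Hong Kong. How accurate could that time possibly be?

By default Ceph tolerates less than one second of clock drift between monitors (the documented default for mon_clock_drift_allowed is 0.05 seconds), so getting time that precise requires a local NTP server. What a Ceph cluster needs is not an accurate international standard time, but a precise time reference shared by all cluster nodes. Once the local server was set up, ceph -w went back to HEALTH_OK very quickly.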
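One way to check whether a node is really tracking the local server is to read the offset of the currently selected peer. The sketch below assumes the classic ntpd daemon and the usual `ntpq -pn` table layout (selected peer marked with a leading `*`, offset in milliseconds in the ninth column). The function name is hypothetical, and a chrony setup would need a different command.

```shell
#!/bin/sh
# Reads `ntpq -pn` output on stdin and prints the offset (in milliseconds,
# ninth column) of the peer ntpd has currently selected, marked with '*'.
# Prints nothing if no peer is selected yet.
selected_peer_offset_ms() {
    awk '/^\*/ { print $9; exit }'
}

# Only query ntpd when ntpq is actually installed on this node.
if command -v ntpq >/dev/null 2>&1; then
    ntpq -pn | selected_peer_offset_ms
fi
```

An offset of a few milliseconds or less on every monitor node suggests that the nodes share a common time base; no output usually means ntpd has not selected a peer yet.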

Method 2

Work around the warning by adjusting parameters:

1. On the admin node, edit ceph.conf and add:

mon_clock_drift_allowed = 5
mon_clock_drift_warn_backoff = 30

For these two parameters, see: http://docs.ceph.com/docs/hammer/rados/configuration/mon-config-ref/#clock

What value is appropriate for mon_clock_drift_allowed? Use a warning like this one as a reference:

2016-07-01 17:44:14.860902 mon.0 [WRN] mon.1 ****:6789/0 clock skew 3.0706s > max 2s
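The observed skew and the current limit can be pulled out of such a warning and compared with a candidate value. This is a minimal sketch: both function names are hypothetical, and the pattern assumes the message format shown in the line above.

```shell
#!/bin/sh
# Prints "<observed_skew> <current_max>" (seconds) for a monitor warning of
# the form "... clock skew 3.0706s > max 2s"; prints nothing otherwise.
parse_clock_skew() {
    printf '%s\n' "$1" |
        sed -n 's/.*clock skew \([0-9.]*\)s > max \([0-9.]*\)s.*/\1 \2/p'
}

# Exits 0 when the candidate allowance (seconds) exceeds the observed skew.
covers_skew() {
    awk -v allowed="$1" -v skew="$2" 'BEGIN { exit !(allowed + 0 > skew + 0) }'
}

line='2016-07-01 17:44:14.860902 mon.0 [WRN] mon.1 ****:6789/0 clock skew 3.0706s > max 2s'
parse_clock_skew "$line"    # prints: 3.0706 2

if covers_skew 5 3.0706; then
    echo "mon_clock_drift_allowed = 5 covers the observed skew"
fi
```

With the skew of 3.0706s reported here, a limit of 2 is too low and the 5 used in ceph.conf above leaves some headroom.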

2. Run the following command; ceph1 and so on are the names of the monitor nodes

ceph-deploy --overwrite-conf admin ceph1 ceph2 ceph3

3. Restart the monitor

systemctl restart ceph-mon@ceph1.service
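The new settings only take effect on a monitor once it has been restarted, so the restart presumably has to be repeated on every monitor node. The sketch below only prints the commands (a dry run). It assumes that each monitor id equals its host name, as with ceph1 above, and that the admin node can reach the monitors over ssh; the function name is hypothetical.

```shell
#!/bin/sh
# Prints the command that would restart the monitor daemon on a given node.
# Assumes the systemd unit is named ceph-mon@<hostname>.service.
restart_cmd() {
    printf 'ssh %s systemctl restart ceph-mon@%s.service\n' "$1" "$1"
}

# Dry run for the two monitors listed in the monmap.
for mon in ceph1 ceph2; do
    restart_cmd "$mon"
done
```

Restarting the monitors one at a time and waiting for each to rejoin the quorum should help the cluster keep a working quorum throughout.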

4. Verify

ceph1:~ # ceph -w
2016-07-01 18:19:08.168452 7fb98021d700  0 -- :/1003345 >> 147.2.208.73:6789/0 pipe(0x7fb96c05d370 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7fb96c0599d0).fault
    cluster f411aff0-1b95-4496-9310-68fa6d568903
     health HEALTH_OK
     monmap e9: 2 mons at {ceph1=147.2.208.114:6789/0,ceph2=147.2.208.44:6789/0}
            election epoch 68, quorum 0,1 ceph2,ceph1
     osdmap e134: 3 osds: 3 up, 3 in
      pgmap v2369: 72 pgs, 2 pools, 0 bytes data, 0 objects
            106 MB used, 45940 MB / 46046 MB avail
                  72 active+clean

2016-07-01 18:19:03.545418 mon.1 [INF] mon.ceph1 calling new monitor election
2016-07-01 18:19:18.653547 mon.0 [INF] mon.ceph2 calling new monitor election
2016-07-01 18:19:18.686790 mon.0 [INF] mon.ceph2@0 won leader election with quorum 0,1
2016-07-01 18:19:18.687641 mon.0 [INF] HEALTH_OK
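Instead of watching ceph -w by hand, the check can be scripted. This is a rough sketch that polls `ceph health` until it reports HEALTH_OK; the function names, the retry count and the 5-second interval are arbitrary choices for illustration.

```shell
#!/bin/sh
# Exits 0 when the given status text starts with HEALTH_OK.
is_health_ok() {
    case "$1" in
        HEALTH_OK*) return 0 ;;
        *)          return 1 ;;
    esac
}

# Polls `ceph health` up to $1 times, 5 seconds apart.
# Exits 0 as soon as the cluster reports HEALTH_OK, 1 if it never does.
wait_for_health_ok() {
    tries=$1
    while [ "$tries" -gt 0 ]; do
        if is_health_ok "$(ceph health 2>/dev/null)"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 5
    done
    return 1
}

# Usage on a node with the ceph CLI and an admin keyring:
#   wait_for_health_ok 12 && echo "clock skew warning cleared"
```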