[Original] Uncle's Troubleshooting Notes (30): mesos agent fails to start: Failed to perform recovery: Incompatible agent info detected

The mesos agent fails to start with the following error:

Feb 15 22:03:18 server1.bj mesos-slave[1190]: E0215 22:03:18.622994 1192 slave.cpp:7311] EXIT with status 1: Failed to perform recovery: Incompatible agent info detected.
...
Feb 15 22:03:18 server1.bj mesos-slave[1190]: ------------------------------------------------------------
Feb 15 22:03:18 server1.bj mesos-slave[1190]: Old agent info:
Feb 15 22:03:18 server1.bj mesos-slave[1190]: hostname: "server1"
...
Feb 15 22:03:18 server1.bj mesos-slave[1190]: ------------------------------------------------------------
Feb 15 22:03:18 server1.bj mesos-slave[1190]: New agent info:
Feb 15 22:03:18 server1.bj mesos-slave[1190]: hostname: "server1.bj"
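
To compare the old and new values quickly, the hostname fields can be pulled out of the log. This is only a minimal sketch: `agent_hostnames` is a hypothetical helper name, `agent.log` is a hypothetical file, and it assumes the log lines have the same shape as the ones above.

```shell
# Hypothetical helper: read agent log lines on stdin and print the value of
# every field that looks like  hostname: "..."
agent_hostnames() {
  sed -n 's/.*hostname: "\([^"]*\)".*/\1/p'
}

# Example (agent.log is a hypothetical file holding the log lines above):
#   agent_hostnames < agent.log
```

If the two printed values differ, the agent sees a hostname that no longer matches the one it recorded earlier.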

The log shows that the hostname changed. This was caused by an edit to the hosts file:

# cat /etc/hosts
192.168.0.1 server1 server1.bj
->
192.168.0.1 server1.bj server1
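
In a hosts file the first name after the address is the canonical hostname and the remaining names are aliases, so swapping the order changes the canonical name. Assuming the agent picks up the canonical name when no hostname is configured explicitly (which is consistent with the old and new values in the log), the effect of the edit can be checked with a sketch like this. `canonical_name` is a hypothetical helper, not part of mesos.

```shell
# Hypothetical helper: print the canonical (first) name that a hosts-format
# file assigns to an IP address.
#   $1 = IP address, $2 = path to the hosts file
canonical_name() {
  ip=$1
  hosts_file=$2
  # Comment lines never match because their first field is not the IP.
  awk -v ip="$ip" '$1 == ip { print $2; exit }' "$hosts_file"
}

# Example:
#   canonical_name 192.168.0.1 /etc/hosts
```

Run against the hosts file before and after the edit, this would print `server1` and `server1.bj` respectively.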

The log also suggests how to fix it:

Feb 15 22:03:18 server1.bj mesos-slave[1190]: If recovery failed due to a change in configuration and you want to
Feb 15 22:03:18 server1.bj mesos-slave[1190]: keep the current agent id, you might want to change the
Feb 15 22:03:18 server1.bj mesos-slave[1190]: `--reconfiguration_policy` flag to a more permissive value.
Feb 15 22:03:18 server1.bj mesos-slave[1190]:
Feb 15 22:03:18 server1.bj mesos-slave[1190]: To restart this agent with a new agent id instead, do as follows:
Feb 15 22:03:18 server1.bj mesos-slave[1190]: rm -f /var/lib/mesos/meta/slaves/latest
Feb 15 22:03:18 server1.bj mesos-slave[1190]: This ensures that the agent does not recover old live executors.

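The mesos agent persists a slave.info file that contains the hostname. If the hostname changes, i.e. it no longer matches the value stored in slave.info, the agent fails with this error: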

# cat /var/lib/mesos/meta/slaves/latest/slave.info
¥
server1
cpus @2*
mem ̀2*
disk  ~ᄇ*
ports"
↑2)

Fix:

# rm -f /var/lib/mesos/meta/slaves/latest
# service mesos-slave start
