I recently got bitten again by my own home-made keepalived health check, so here is the latest optimized script:

############start scripts
killall -0 redis-server
if [ "$?" -eq 0 ]; then
    echo good
    exit 0
else
    LOGFILE=/var/log/keepalived-redis-state.log
    echo "[check_fail_log]" >> "$LOGFILE"
    date >> "$LOGFILE"
    # retry the signal check once before going any further
    killall -0 redis-server
    if [ "$?" -ne 0 ]; then
        echo killall_bad_two >> "$LOGFILE"
        numprocess1=`netstat -tnpl | grep 6379 -c`
        numprocess2=`ps -ef | grep redis-server | grep -c 6379`
        if [[ ${numprocess1} -lt 1 ]] && [[ ${numprocess2} -lt 1 ]]; then
            echo process_bad_three >> "$LOGFILE"
            #start judge status
            /etc/init.d/redis status
            if [[ $? -eq 0 ]]; then
                echo good
                exit 0
            else
                echo status_bad_four >> "$LOGFILE"
                /etc/init.d/redis status >> "$LOGFILE"
                ####start last judge
                ALIVE=`/usr/bin/redis-cli -p 6379 PING`
                if [ "$ALIVE" != "PONG" ]; then
                    echo ping_bad_five >> "$LOGFILE"
                    exit 1
                else
                    exit 0
                fi
                ###end last judge
            fi
            #end judge status
        else
            exit 0
        fi
    else
        exit 0
    fi
fi
############end scripts

The current checks are:
Layer 1: process signal check (killall -0), tried twice.
Layer 2: netstat and ps -ef check.
Layer 3: return code of /etc/init.d/redis status.
Layer 4: reply to redis-cli PING (anything other than PONG counts as a failure).

Every layer writes a log entry when it fails. The script exits 1 only when all of the layers fail, which means both the redis process and client access are broken, and the switchover formally begins.

Result in the test environment. Log output:

[check_fail_log]
Thu May 19 09:46:23 CST 2016
killall_bad_two
process_bad_three
status_bad_four
redis-server is stopped
ping_bad_five
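The nesting in the script above gets deep quickly, so here is a rough sketch of the same "healthy as soon as any layer passes" idea written as a loop. This is not the script I run in production. The function names run_layers and check_redis are made up for illustration. Unlike the original, the first failed killall also gets a tag of its own (killall_bad_one, also invented here), and the init script's status text is not copied into the log.

```bash
#!/bin/bash
# Sketch only: layered health check as a loop instead of nested if/else.
# The default log path matches the original script; override LOGFILE for testing.
LOGFILE="${LOGFILE:-/var/log/keepalived-redis-state.log}"

# run_layers TAG1 CMD1 [TAG2 CMD2 ...]
# Runs each CMD in order and returns 0 as soon as one succeeds.
# Each failing layer appends its TAG to $LOGFILE; the first failure also
# writes the [check_fail_log] header and a timestamp.
# Returns 1 only when every layer has failed.
run_layers() {
    local first_failure=1 tag cmd
    while [ "$#" -ge 2 ]; do
        tag="$1"; cmd="$2"; shift 2
        if eval "$cmd" >/dev/null 2>&1; then
            return 0
        fi
        if [ "$first_failure" -eq 1 ]; then
            echo "[check_fail_log]" >> "$LOGFILE"
            date >> "$LOGFILE"
            first_failure=0
        fi
        echo "$tag" >> "$LOGFILE"
    done
    return 1
}

# Hypothetical wrapper that plugs in the checks from the original script.
# Port 6379 and the binary paths are taken from that script.
check_redis() {
    run_layers \
        killall_bad_one   'killall -0 redis-server' \
        killall_bad_two   'killall -0 redis-server' \
        process_bad_three 'netstat -tnpl | grep -q 6379 || ps -ef | grep redis-server | grep -q 6379' \
        status_bad_four   '/etc/init.d/redis status' \
        ping_bad_five     '[ "$(/usr/bin/redis-cli -p 6379 PING)" = "PONG" ]'
}
```

To use it as the keepalived check, the file would end with a call such as: check_redis; exit $?. Adding or reordering a layer then means editing one line of the list instead of another level of if/else.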