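I want to write a shell script on the jump host that batch-checks whether the main process is running healthily on a set of remote servers.
First, pick one of the remote machines and look at the state of its main process: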
[root@two002 tmp]# ps -ef|grep main
root     23448 23422  0 11:40 pts/0    00:00:00 grep --color=auto main
[root@two002 tmp]# ps -ef|grep main|grep -v grep|wc -l
0
The shell check script is as follows:
[root@two002 tmp]# cat /tmp/main_check.sh
#!/bin/bash
NUM=$(ps -ef|grep main|grep -v grep|wc -l)
if [ $NUM -eq 0 ];then
    echo "It's not good! main is stoped!"
else
    echo "Don't worry! main is running!"
fi
Run the script:
[root@two002 tmp]# sh -x /tmp/main_check.sh
++ grep main
++ grep -v grep
++ wc -l
++ ps -ef
+ NUM=2
+ '[' 2 -eq 0 ']'
+ echo 'Don'\''t worry! main is running!'
Don't worry! main is running!
[root@two002 tmp]# sh /tmp/main_check.sh
Don't worry! main is running!
As the output above shows, the value assigned to the NUM variable while the script runs is 2, even though running ps -ef|grep main|grep -v grep|wc -l by hand clearly returns 0.
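This is a grep matching problem. The script itself is named main_check.sh, so while it runs, its own command line (sh /tmp/main_check.sh, plus the subshell forked for the command substitution) contains the string "main" and is counted by grep main. grep has to match the whole word instead, i.e. "grep -w": the underscore counts as a word character, so main_check.sh no longer matches the word main. The main_check.sh script therefore needs to be changed as follows: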
[root@two002 tmp]# cat /tmp/main_check.sh
#!/bin/bash
NUM=$(ps -ef|grep -w main|grep -v grep|wc -l)
if [ $NUM -eq 0 ];then
    echo "Oh!My God! It's broken! main is stoped!"
else
    echo "Don't worry! main is running!"
fi
Run the check script again, and this time the result is correct:
[root@two002 tmp]# sh -x /tmp/main_check.sh
++ grep -w main
++ grep -v grep
++ wc -l
++ ps -ef
+ NUM=0
+ '[' 0 -eq 0 ']'
+ echo 'Oh!My God! It'\''s broken! main is stoped!'
Oh!My God! It's broken! main is stoped!
[root@two002 tmp]# sh /tmp/main_check.sh
Oh!My God! It's broken! main is stoped!
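The difference between the two grep invocations can be reproduced without touching a live process table. The sketch below feeds canned ps -ef lines through both pipelines; the sample lines and the sample_ps helper are made up for illustration and are not taken from the real machine.

```bash
#!/bin/bash
# Canned `ps -ef` lines (hypothetical): the shell running the script,
# the subshell forked for $(...), and one unrelated process.
sample_ps() {
    cat <<'EOF'
root     23500 23422  0 11:41 pts/0    00:00:00 sh /tmp/main_check.sh
root     23501 23500  0 11:41 pts/0    00:00:00 sh /tmp/main_check.sh
root       812     1  0 10:02 ?        00:00:03 /usr/sbin/sshd -D
EOF
}

# Substring match: "main" inside "main_check.sh" counts, so the script sees itself.
echo "grep main:    $(sample_ps | grep main | grep -v grep | wc -l)"

# Word match: "_" is a word character, so "main_check.sh" is not the word "main".
echo "grep -w main: $(sample_ps | grep -w main | grep -v grep | wc -l)"
```

With these sample lines the first count is 2 and the second is 0, matching the two runs above. Note that -w would still count a command line such as sh /tmp/main.sh, because "." is not a word character, so the script name should stay clear of that pattern.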
So, on the jump host, the scripts for batch-checking the state of the main process on the remote servers are:
[root@tiaoban ~]# cat /usr/bin/main_check
#!/bin/bash
NUM=$(ps -ef|grep -w main|grep -v grep|wc -l)
if [ $NUM -eq 0 ];then
    echo "Oh!My God! It's broken! main is stoped!"
else
    echo "Don't worry! main is running!"
fi

[root@tiaoban ~]# cat /opt/script/main_check.sh
#!/bin/bash
for i in $(cat /opt/ip.list)
do
    /usr/bin/rsync -e "ssh -p22" -avpgolr /usr/bin/main_check $i:/usr/bin/ > /dev/null 2>&1
    ssh -p22 root@$i "echo $i;sh /usr/bin/main_check"
done
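One gap in the loop above is that a host that cannot be reached prints nothing useful, which is easy to mistake for a healthy host. The following is a minimal sketch of one way to report that case separately. It assumes /usr/bin/main_check has already been copied to each host; the function names remote_check and check_hosts, and the BatchMode and ConnectTimeout settings, are illustrative choices, not part of the original scripts.

```bash
#!/bin/bash
# Sketch only: remote_check and check_hosts are hypothetical helper names.

remote_check() {
    # -n stops ssh from swallowing the loop's stdin (the IP list);
    # BatchMode fails instead of prompting for a password.
    ssh -n -p22 -o BatchMode=yes -o ConnectTimeout=5 "root@$1" "sh /usr/bin/main_check"
}

check_hosts() {
    local list=$1 host out
    while read -r host; do
        [ -z "$host" ] && continue    # skip blank lines in the list
        # main_check itself always exits 0, so a non-zero status here
        # means ssh failed or the script could not be run remotely.
        if out=$(remote_check "$host" 2>/dev/null); then
            echo "$host: $out"
        else
            echo "$host: check failed (ssh error or host unreachable)"
        fi
    done < "$list"
}
```

Calling check_hosts /opt/ip.list would then print one line per host, so an unreachable machine shows up explicitly rather than as missing output.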