Custom Nagios plugins in active and passive mode, plus simple mail-based alerting

Date: 2020-01-05


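A Nagios plugin returns two things: its exit status code, and the first line it prints to standard output. The Nagios core uses the exit status code to decide the state of the monitored service. The first line of output serves as a supplementary description of that state and is displayed in the web management interface.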

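To run a plugin, Nagios spawns a child process every time it checks the state of a service, and it uses the output and exit status code of that command to determine the state. The status codes that the Nagios core recognizes are: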

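OK: exit code 0 -- the service is working normally

WARNING: exit code 1 -- the service is in a warning state

CRITICAL: exit code 2 -- the service is in a critical, severe state

UNKNOWN: exit code 3 -- the state of the service is unknown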

[root@RS1 services]# head -7 /usr/local/nagios/libexec/utils.sh

#! /bin/sh

STATE_OK=0

STATE_WARNING=1

STATE_CRITICAL=2

STATE_UNKNOWN=3

STATE_DEPENDENT=4
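The contract described above (one status line plus an exit code) can be sketched as a small shell function. This is only an illustration: `check_threshold` and its thresholds are hypothetical names, not part of Nagios, and the state codes are redefined locally rather than sourced from utils.sh so that the sketch runs on its own.

```shell
#!/bin/bash
# State codes, mirroring the values shown in utils.sh above.
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# check_threshold VALUE WARN CRIT   (hypothetical helper)
# Prints a single status line and returns the matching state code.
check_threshold() {
    local value=$1 warn=$2 crit=$3
    case "$value" in
        ''|*[!0-9]*)
            # Anything that is not a non-negative integer cannot be judged.
            echo "UNKNOWN - '$value' is not a non-negative integer"
            return "$STATE_UNKNOWN" ;;
    esac
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL - value is $value"
        return "$STATE_CRITICAL"
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING - value is $value"
        return "$STATE_WARNING"
    fi
    echo "OK - value is $value"
    return "$STATE_OK"
}

# A real plugin would finish with something like:
#   check_threshold "$measured" 80 90; exit $?
```

Returning the code from a function and calling `exit $?` once at the end keeps the status line and the exit code in one place, which makes it harder for the two to disagree.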

Example 1: detect whether /etc/passwd has changed, checked through NRPE (referred to in this article as passive mode)

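How it works: collect a fingerprint of the file with md5sum: md5sum /etc/passwd > /etc/passwd.md5

Then verify the fingerprint with md5sum -c /etc/passwd.md5. If the output contains OK, the file is unchanged; otherwise it has changed.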

Monitoring whether the password file has been modified:

First build the fingerprint database:

md5sum /etc/passwd > /etc/passwd.md5
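To see how the fingerprint check behaves without touching /etc/passwd, the same two md5sum steps can be tried on a scratch file. This is a sketch under assumptions: `fingerprint_ok` is a hypothetical helper, the file contents are made up, and the `--status` option (no output, result only in the exit code) assumes GNU coreutils md5sum.

```shell
#!/bin/bash
# fingerprint_ok FILE.md5   (hypothetical helper)
# Returns 0 when every file listed in FILE.md5 still matches its recorded checksum.
fingerprint_ok() {
    md5sum --status -c "$1"
}

# Work on a throwaway copy instead of the real /etc/passwd.
workdir=$(mktemp -d)
echo "alice:x:1000:1000::/home/alice:/bin/bash" > "$workdir/passwd"

# Step 1: record the fingerprint.
md5sum "$workdir/passwd" > "$workdir/passwd.md5"

# Step 2: verify it while the file is untouched.
fingerprint_ok "$workdir/passwd.md5" && echo "unchanged"

# Modify the file and verify again; the checksum no longer matches.
echo "mallory:x:0:0::/root:/bin/bash" >> "$workdir/passwd"
fingerprint_ok "$workdir/passwd.md5" || echo "changed"

rm -rf "$workdir"
```

Relying on the exit code of `md5sum -c` avoids parsing its text output, which can differ between locales.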

On the client, create the script: vim /usr/local/nagios/libexec/check_passwd

#!/bin/bash
# Count the "OK" lines reported by md5sum -c; exactly one means the file still matches.
char=$(md5sum -c /etc/passwd.md5 2>&1 | grep -c "OK")
if [ "$char" -eq 1 ]; then
    echo "passwd is OK"
    exit 0
else
    echo "passwd is changed"
    exit 2
fi

###### Make the script executable

chmod +x /usr/local/nagios/libexec/check_passwd

##### Define the check_passwd command

vim /usr/local/nagios/etc/nrpe.cfg

command[check_passwd]=/usr/local/nagios/libexec/check_passwd

##### Restart the nrpe service

###### On the Nagios server, first fetch the data manually

[root@RS1 libexec]# ./check_nrpe -H 192.168.1.11 -c check_passwd

passwd is OK

###### Define the service on the Nagios server

vim /usr/local/nagios/etc/objects/services.cfg (active mode and passive mode each have their own services.cfg file and are managed separately)

define service{

use generic-service

host_name client02

service_description check_passwd

check_command check_nrpe!check_passwd

}

Then fetch the data manually on the Nagios server:

/usr/local/nagios/libexec/check_nrpe -H 192.168.1.11 -c check_passwd

If data comes back, the setup is basically working. Restart the service and check the web interface (screenshot):

Example 2: custom monitoring of a web URL, using active mode

[root@RS1 ~]# curl -I http://192.168.1.11/index.html 2>/dev/null|grep "OK"

HTTP/1.1 200 OK

[root@RS1 ~]# curl -I http://192.168.1.11/index.html 2>/dev/null|grep "OK"|wc -l

1. Write the check script

cd /usr/local/nagios/libexec

vim check_web_url

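#!/bin/bash
# Fetch only the response headers and count the lines containing "OK" (e.g. "HTTP/1.1 200 OK").
char=$(curl -I http://192.168.1.11/index.html 2>/dev/null | grep -c "OK")
if [ "$char" -eq 1 ]; then
    echo "the url is OK"
    exit 0
else
    echo "the url is wrong"
    exit 2
fi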

chmod +x check_web_url
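A variant of the script above can take the URL as an argument and decide on the numeric HTTP status code instead of grepping for the word "OK". This is a sketch, not the author's script: `http_code_to_state` and `check_url` are hypothetical names, and the mapping (4xx as WARNING, 5xx or no response as CRITICAL) is one possible policy chosen for illustration.

```shell
#!/bin/bash
# http_code_to_state CODE   (hypothetical helper)
# Maps an HTTP status code to a Nagios status line and exit code.
http_code_to_state() {
    case "$1" in
        2??|3??) echo "OK - HTTP $1";                   return 0 ;;
        4??)     echo "WARNING - HTTP $1";              return 1 ;;
        5??)     echo "CRITICAL - HTTP $1";             return 2 ;;
        000)     echo "CRITICAL - no response";         return 2 ;;
        *)       echo "UNKNOWN - unexpected code '$1'"; return 3 ;;
    esac
}

# check_url URL   (hypothetical helper)
# Sends a HEAD request with a 10 second limit; curl reports 000 when the
# request fails before any HTTP response is received.
check_url() {
    local code
    code=$(curl -s -o /dev/null -I -m 10 -w '%{http_code}' "$1")
    http_code_to_state "$code"
}

# As a plugin, the script would end with:
#   check_url "${1:-http://192.168.1.11/index.html}"; exit $?
```

Keeping the classification in a function that does no network access makes the policy easy to change and to check by hand, for example `http_code_to_state 404`.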

2. Add the check_web_url command to the commands.cfg configuration file

############define command check_web_url##########

define command{

command_name check_web_url

command_line $USER1$/check_web_url

}

3. Edit the service configuration file

cd /usr/local/nagios/etc/services

vim web_url.cfg

define service{

use generic-service

host_name client02 ; the monitored host 192.168.1.11 is defined in hosts.cfg

service_description web_url

check_period 24x7

check_interval 5

retry_interval 1

max_check_attempts 3

check_command check_web_url ; called directly, without check_nrpe, because this is active mode

notification_period 24x7

notification_interval 30

notification_options w,u,c,r

contact_groups admins

}

4. Check the configuration for errors and reload the service

[root@RS1 services]# /etc/init.d/nagios checkconfig

Running configuration check...

OK.

[root@RS1 services]# /etc/init.d/nagios reload

Running configuration check...

Reloading nagios configuration...

done
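The check-then-reload sequence above can be wrapped so that the reload never runs when the check fails. This is a sketch: `reload_if_valid` is a hypothetical helper, and it assumes the check command exits with a non-zero status when the configuration is invalid.

```shell
#!/bin/bash
# reload_if_valid VERIFY_CMD RELOAD_CMD   (hypothetical helper)
# Runs RELOAD_CMD only when VERIFY_CMD succeeds.
reload_if_valid() {
    local verify=$1 reload=$2
    if ! sh -c "$verify" >/dev/null 2>&1; then
        echo "configuration check failed, not reloading" >&2
        return 1
    fi
    sh -c "$reload"
}

# Usage with the init script shown above:
#   reload_if_valid "/etc/init.d/nagios checkconfig" "/etc/init.d/nagios reload"
```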

Screenshot of the successful check:

Overall monitoring view:

Implementing mail alerts:

Steps to configure alerting:

1. Add a contact and a contact group in contacts.cfg

define contact{

contact_name huang

use generic-contact ; the template used here is the contact definition in the templates file

alias Nagios Admin

email 13817419446@139.com

}

Add the contact_name defined above to a new group.

New contact group:

define contactgroup{

contactgroup_name mail_users ; you can define a mail group, an SMS group, and so on here

alias Nagios Administrators

members huang

}

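2. Add the notification command in commands.cfg. The default commands are used here, but you can also write your own shell script or a script in another language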
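As an illustration of the "write your own shell script" option, a custom notification script could look roughly like this. It is a sketch, not the default Nagios command: the function names and the message layout are hypothetical, and the values are assumed to be passed in as arguments from a command_line that uses macros such as $CONTACTEMAIL$, $NOTIFICATIONTYPE$, $HOSTNAME$, $SERVICEDESC$, $SERVICESTATE$ and $SERVICEOUTPUT$.

```shell
#!/bin/bash
# format_service_alert TYPE HOST SERVICE STATE OUTPUT   (hypothetical helper)
# Builds the body of the alert mail.
format_service_alert() {
    printf '***** Nagios *****\n\n'
    printf 'Notification Type: %s\n' "$1"
    printf 'Host: %s\n' "$2"
    printf 'Service: %s\n' "$3"
    printf 'State: %s\n\n' "$4"
    printf 'Additional Info:\n%s\n' "$5"
}

# send_service_alert TO TYPE HOST SERVICE STATE OUTPUT   (hypothetical helper)
# Pipes the body to the mail command with a descriptive subject line.
send_service_alert() {
    local to=$1
    shift
    format_service_alert "$@" | mail -s "** $1 Service Alert: $2/$3 is $4 **" "$to"
}

# Preview the body without sending anything:
format_service_alert PROBLEM client02 web_url CRITICAL "the url is wrong"
```

Separating the formatting from the sending means the message can be previewed on the command line before any mail setup exists.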

3. Adjust the default contact template

define contact{

name generic-contact

service_notification_period 24x7

host_notification_period 24x7

service_notification_options w,u,c,r,f,s

host_notification_options d,u,r,f,s

service_notification_commands notify-service-by-email

host_notification_commands notify-host-by-email ; if a phone or pager is defined, this can be notify-host-by-email,notify-host-by-pager; only mail alerts are used here, so that is not needed

}
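The single-letter flags in service_notification_options are easy to misread, so a small lookup can spell them out. This is a sketch with hypothetical function names; the meanings listed are the standard ones for service notifications (for hosts, d means DOWN and u means UNREACHABLE instead).

```shell
#!/bin/bash
# describe_service_option LETTER   (hypothetical helper)
describe_service_option() {
    case "$1" in
        w) echo "WARNING state" ;;
        u) echo "UNKNOWN state" ;;
        c) echo "CRITICAL state" ;;
        r) echo "recovery (back to OK)" ;;
        f) echo "flapping start/stop" ;;
        s) echo "scheduled downtime start/end" ;;
        n) echo "no notifications" ;;
        *) echo "unrecognised option '$1'"; return 1 ;;
    esac
}

# explain_service_options "w,u,c"   (hypothetical helper)
# Splits the comma-separated list and prints one description per line.
explain_service_options() {
    local opt
    local IFS=,
    for opt in $1; do
        printf '%s = %s\n' "$opt" "$(describe_service_option "$opt")"
    done
}

explain_service_options "w,u,c,r,f,s"
```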

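4. Add the alert contacts and contact groups in the hosts and services configuration files

Then modify the service and host definitions in the templates, changing

contact_groups admins

to

contact_groups mail_users

You can also leave the templates alone and define different notification methods and contact groups in the individual hosts and services configuration files.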

Experiment:

Move index.html out of the site directory into /tmp so that the check fails and triggers an alert:

mv /var/www/html/index.html /tmp

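The web interface now shows the service in a non-OK state (the script above exits with 2, which Nagios reports as CRITICAL). The Nagios log looks like this (screenshot):

Checking the mailbox shows that no alert mail arrived. The log shows that the mail command could not be found, so install it: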

yum -y install mailx
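Whether the mail command is available can be checked up front rather than discovered from the log after a missed alert. A minimal sketch, where `have_cmd` is a hypothetical helper built on the shell's `command -v`:

```shell
#!/bin/bash
# have_cmd NAME   (hypothetical helper)
# Succeeds when NAME resolves to something runnable in this shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd mail; then
    echo "mail command found: $(command -v mail)"
else
    echo "mail command missing; email notifications cannot be sent" >&2
fi
```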

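Because the service notification options are defined as

service_notification_options w,u,c,r,f,s

a recovery to the normal state also triggers a mail. So put index.html back into the site directory: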

mv /tmp/index.html /var/www/html

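After a few minutes the check returns to normal. Check the Nagios log (screenshot):

Check the mailbox again (screenshot):

With that, simple mail-based alerting is working.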
