An Overview of Nagios Components and a Detailed Look at NRPE

Date: 2019-12-09

Tags: nagios, component overview, nrpe, detailed guide



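I. Overview of Nagios Components

Nagios configuration is fairly tedious and involves many components, so here I list the key ones, briefly describe each, and explain how they relate to one another. This is my own understanding; corrections are welcome.

I mainly used the following components in my deployment: nagios-3.2.3.tar.gz, nagios-plugins-1.4.15.tar.gz, ndoutils-1.4b7.tar.gz, and nrpe-2.12.tar.gz.

What does each of them do?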

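1. nagios-3.2.3.tar.gz is the Nagios core; it includes the various configuration files.

2. nagios-plugins-1.4.15.tar.gz is the Nagios plugin collection. It provides the check commands, such as check_tcp, that cover most commonly monitored objects. You can also write your own scripts, which is where Nagios is very flexible.

3. ndoutils-1.4b7.tar.gz stores Nagios monitoring data in a MySQL database.

4. nrpe-2.12.tar.gz is the agent used to monitor host resources on the monitored machine. Without an agent like this, Nagios cannot check local resources such as load, disk, and processes on a remote server.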

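Those are the main components. A few others are also worth noting, such as NSClient-0.3.8-Win32.zip (installed on monitored hosts that run Windows) and npc (used when integrating Cacti with Nagios, to feed Nagios monitoring data into Cacti).

With the relationships roughly sorted out, and since an earlier post covered deploying Nagios itself, I will not repeat that here. What follows is a detailed walkthrough of deploying NRPE.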

II. NRPE in Detail

1. First, a table of my monitored objects and thresholds:

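Category	Check	Arguments (meaning)
Host resources	Host alive: check_ping	-w 3000.0,80% -c 5000.0,100% -p 5 (warning if the average round-trip time exceeds 3000 ms or packet loss reaches 80%; critical if it exceeds 5000 ms or packet loss reaches 100%; 5 packets are sent)
	Logged-in users: check_users	-w 5 -c 10 (w is the warning threshold, c the critical threshold)
	System load: check_load	-w 15,10,5 -c 30,25,20 (warning or critical when the 1-, 5-, or 15-minute load average exceeds its corresponding value)
	Disk usage: check_disk	-w 20% -c 10% -p / (warning when free space on the root partition falls below 20% of its size, critical below 10%; -p names the partition)
	Disk I/O (script): check_iostat	-w 5 -c 10 (warning when iowait exceeds 5%, critical when it exceeds 10%)
	Zombie processes: check_zombie_procs	-w 5 -c 10 -s Z (warning above 5 zombie processes, critical above 10)
	Total processes: check_total_procs	-w 150 -c 200 (warning above 150 processes, critical above 200)
	Memory (script): check_mem	-w 90% -c 95% (warning at 90%, critical at 95%, as defined by the check_mem script)
	Swap usage: check_swap	-w 20% -c 10% (warning when free swap falls below 20% of its size, critical below 10%)
Application services	Service port: check_tcp	-H localhost2 -p 80 (host and port number)
	Page response time: check_http	-H localhost2 -u http://localhost2/test.jsp -w 5 -c 10 (checks the page; warning above 5 s, critical above 10 s)
	IP connection count (script): check_ips	-w 200 -c 250 (warning above 200 connections, critical above 250)
Traffic	Server traffic: check_traffic	-V 2c -C public -H localhost2 -I 2 -w 12,30 -c 15,35 -M -b (SNMP version, community string, host, interface index, warning thresholds, critical thresholds)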
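
The -w and -c arguments in the table follow the usual Nagios plugin convention: a plugin reports its state through its exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). A minimal sketch of that mapping for a "higher is worse" metric, such as logged-in users or process counts, might look like this. The function name classify is made up for illustration and is not part of any plugin. Real plugins differ in how they treat a value exactly equal to a threshold, and checks such as check_disk and check_swap work the other way around (less free space is worse).

```shell
#!/bin/sh
# Hypothetical helper: map an integer value to a Nagios state
# for a metric where larger values are worse.
classify() {
    value="$1"; warn="$2"; crit="$3"
    if [ "$value" -gt "$crit" ]; then
        echo "CRITICAL"; return 2
    elif [ "$value" -gt "$warn" ]; then
        echo "WARNING"; return 1
    fi
    echo "OK"; return 0
}

# Try a few values against -w 5 -c 10.
for v in 3 7 12; do
    state=$(classify "$v" 5 10) || true
    echo "value=$v state=$state"
done
```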

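Database monitoring will be added later.

2. Installation

1) Monitoring server (master)

NRPE must also be installed on the master, because its check_nrpe plugin is what queries the remote hosts: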

tar zxf nrpe-2.12.tar.gz

cd nrpe-2.12

./configure

make all

make install-plugin

Only make install-plugin is needed here, because the master only requires the check_nrpe plugin.

On the master, add the following to /usr/local/nagios/etc/object/commands.cfg:

#################################################################

# 'check_nrpe' command definition

define command{

command_name check_nrpe

command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$

}

##################################################################
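
In that definition, $USER1$ is the plugin-directory macro (conventionally set in resource.cfg), $HOSTADDRESS$ is the address of the host being checked, and $ARG1$ is the first argument after the ! in a service's check_command. As a rough illustration of the substitution, the sketch below prints the command line Nagios would end up running. The function name expand_check_nrpe and the address 192.168.175.201 are placeholders invented for this example.

```shell
#!/bin/sh
# Illustration only: mimic how the macros in the check_nrpe
# command_line are filled in. Nothing is executed remotely.
expand_check_nrpe() {
    user1="$1"         # value of $USER1$ (plugin directory)
    hostaddress="$2"   # value of $HOSTADDRESS$
    arg1="$3"          # value of $ARG1$ (command name in nrpe.cfg)
    printf '%s/check_nrpe -H %s -c %s\n' "$user1" "$hostaddress" "$arg1"
}

# 192.168.175.201 is a made-up address for a monitored host.
expand_check_nrpe /usr/local/nagios/libexec 192.168.175.201 check_load
```

Running the printed command by hand on the master is a quick way to confirm that a monitored host answers before any service is defined for it.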

2) Monitored host (client)

On the monitored host, install nagios-plugins-1.4.15.tar.gz first, then nrpe-2.12.tar.gz.

Add the user:

useradd nagios

Install the Nagios plugins:

tar zxvf nagios-plugins-1.4.15.tar.gz

cd nagios-plugins-1.4.15

./configure --with-nagios-user=nagios --with-nagios-group=nagios --enable-redhat-pthread-workaround --prefix=/usr/local/nagios

make

make install

chown -R nagios:nagios /usr/local/nagios

Install NRPE:

tar fvxz nrpe-2.12.tar.gz

cd nrpe-2.12

./configure

make all

make install-plugin

make install-daemon

make install-daemon-config

Open /usr/local/nagios/etc/nrpe.cfg (for example with vim).

It already contains some default command templates:

# The following examples use hardcoded command arguments...

command[check_users]=/opt/nagios/libexec/check_users -w 5 -c 10

command[check_load]=/opt/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

command[check_hda1]=/opt/nagios/libexec/check_disk -w 20 -c 10 -p /dev/hda1

command[check_zombie_procs]=/opt/nagios/libexec/check_procs -w 5 -c 10 -s Z

command[check_total_procs]=/opt/nagios/libexec/check_procs -w 150 -c 200

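The NRPE daemon runs these commands when the master's check_nrpe requests them, which is how host resources are monitored remotely. You can modify these options and add your own checks, such as scripts you have written yourself.

Below is my modified configuration, listed briefly for reference: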

# The following examples use hardcoded command arguments...

command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10

command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20

command[check_hda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/hda1

command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z

command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200

command[check_swap]=/usr/local/nagios/libexec/check_swap -w 20% -c 10%

command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /

command[check_ips]=/usr/local/nagios/libexec/ip_conn.sh 200 250

command[check_mem]=/usr/local/nagios/libexec/check_mem.sh -w 90% -c 95%

command[check_iostat]=/usr/local/nagios/libexec/check_iostat -w 5 -c 10

command[check_traffic]=/usr/local/nagios/libexec/check_traffic.sh -V 2c -C public -H localhost2 -I 2 -w 12,30 -c 15,35 -M -b
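
The name inside the square brackets is what the master passes to check_nrpe with -c. To double-check which names a given nrpe.cfg exposes, something like the sketch below could be used. The function name list_nrpe_commands is hypothetical, and the temporary file only stands in for the real /usr/local/nagios/etc/nrpe.cfg.

```shell
#!/bin/sh
# Hypothetical helper: print the command names defined in an nrpe.cfg.
# Lines that do not start with "command[" (such as comments) are skipped.
list_nrpe_commands() {
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1"
}

# Demo against a throwaway file rather than the live configuration.
cfg=$(mktemp)
cat > "$cfg" <<'EOF'
# comment lines are ignored
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
EOF
list_nrpe_commands "$cfg"
rm -f "$cfg"
```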

Also remember to grant the monitoring server access, using the allowed_hosts setting earlier in the file:

allowed_hosts=127.0.0.1,192.168.175.200
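
A master whose address is missing from allowed_hosts will be refused by the daemon, so it can be worth checking the list before troubleshooting anything else. The sketch below does a plain exact-match lookup. The function name host_allowed is invented for illustration, and it assumes the list contains only literal addresses separated by commas.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given IP appears literally in the
# allowed_hosts line of an nrpe.cfg (exact match only).
host_allowed() {
    cfg="$1"; ip="$2"
    # Use the last allowed_hosts line and drop spaces around the commas.
    hosts=$(sed -n 's/^allowed_hosts=//p' "$cfg" | tail -n 1 | tr -d ' ')
    case ",$hosts," in
        *",$ip,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Demo against a throwaway file with the values shown above.
cfg_file=$(mktemp)
echo 'allowed_hosts=127.0.0.1,192.168.175.200' > "$cfg_file"
if host_allowed "$cfg_file" 192.168.175.200; then
    echo "allowed"
else
    echo "not allowed"
fi
rm -f "$cfg_file"
```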

When finished, start nrpe: /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d

Finally, add the services to be monitored on the master, for example:

vim /usr/local/nagios/etc/object/services.cfg

define service{

host_name localhost2

service_description check-tcp-8080

check_command check_tcp!8080

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups sagroup

}

define service{

host_name localhost2

service_description check-http

check_command check_http!http://localhost2/test.jsp!5!10

max_check_attempts 5

normal_check_interval 3

retry_check_interval 2

check_period 24x7

notification_interval 10

notification_period 24x7

notification_options w,u,c,r

contact_groups sagroup

}

If the command already exists in /usr/local/nagios/libexec, simply add the command to nrpe.cfg on the monitored host and add the service to services.cfg on the master.
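
The two service definitions above run check_tcp and check_http directly from the master. A service that goes through NRPE would instead pass the nrpe.cfg command name as the argument to check_nrpe. As a rough sketch, the shell function below prints such a stanza, reusing the intervals and contact group from the examples above. The function name nrpe_service is invented for illustration, and the output would still need to be pasted into services.cfg by hand.

```shell
#!/bin/sh
# Hypothetical helper: print a service definition that calls an
# nrpe.cfg command on a monitored host via check_nrpe.
nrpe_service() {
    host="$1"; nrpe_cmd="$2"
    cat <<EOF
define service{
    host_name               $host
    service_description     $nrpe_cmd
    check_command           check_nrpe!$nrpe_cmd
    max_check_attempts      5
    normal_check_interval   3
    retry_check_interval    2
    check_period            24x7
    notification_interval   10
    notification_period     24x7
    notification_options    w,u,c,r
    contact_groups          sagroup
}
EOF
}

nrpe_service localhost2 check_load
```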

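However, some of the checks above are not shipped with nagios-plugins; they are scripts I found online. How are those configured? Take the check for a server's IP connection count as an example.

1. Put the script in /usr/local/nagios/libexec;

For example: vim ip_conn.sh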

#!/bin/sh

# Usage: ip_conn.sh <warning> <critical>
if [ $# -ne 2 ]
then
    echo "Usage: $0 <warning> <critical>"
    exit 3
fi
# Count established TCP connections
ip_conns=`netstat -an | grep tcp | grep EST | wc -l`
# At or below the warning threshold: OK
if [ "$ip_conns" -le "$1" ]
then
    echo "OK - connection count is $ip_conns"
    exit 0
fi
# Above warning but not above critical: WARNING
if [ "$ip_conns" -le "$2" ]
then
    echo "Warning - connection count is $ip_conns"
    exit 1
fi
# Above the critical threshold: CRITICAL
echo "Critical - connection count is $ip_conns"
exit 2

2. Change its owner and permissions;

For example: