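For deployment, follow this document; to learn Nagios in depth, still refer to the documents by Ma Ge (馬哥) and Lao Nanhai (老男孩).
This article summarizes my own deployment steps, based mainly on Ma Ge's Nagios procedure.
Monitoring server:
Preparation before installation
(1) Resolve the dependencies needed to install Nagios:
The basic Nagios components depend on httpd, gcc, and gd. The following command installs any of the required rpm packages that are still missing: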
# yum -y install httpd gcc glibc glibc-common gd gd-devel php php-mysql mysql mysql-devel mysql-server
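Note: these packages can also be built from source, but then many of the file paths used later have to be adjusted one by one to match your source-build configuration. You also need to start the required services, such as httpd, as needed.
(2) Add the user and group that Nagios runs as: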
# groupadd nagcmd
# useradd -G nagcmd nagios
# passwd nagios
Add apache to the nagcmd group so that operations performed on Nagios through the web interface have sufficient permissions:
# usermod -a -G nagcmd apache
3. Compile and install Nagios:
# tar zxf nagios-3.3.1.tar.gz
# cd nagios-3.3.1
# ./configure --with-command-group=nagcmd --enable-event-broker
# make all
# make install
# make install-init
# make install-commandmode
# make install-config
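Set email to the address that should receive Nagios alerts; the default is the local nagios user:
# vi /usr/local/nagios/etc/objects/contacts.cfg
email nagios@localhost ; this is the default setting
Create the Nagios web configuration file in httpd's configuration directory (conf.d):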
# make install-webconf
Create a user for logging in to the Nagios web interface; this account is used for authentication whenever you log in through the web later:
# htpasswd -c /usr/local/nagios/etc/htpasswd.users nagiosadmin
Once the steps above are done, restart httpd:
# service httpd restart
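4. Compile and install nagios-plugins and nrpe
When Nagios needs to monitor services or resources on a remote Linux host, the check proceeds as follows:
Step 1: the Nagios server runs the check_nrpe plugin; the Nagios configuration tells it what to check.
Step 2: check_nrpe connects over SSL to the NRPE daemon on the remote monitored Linux client.
Step 3: the NRPE daemon on the monitored client runs the corresponding Nagios plugin to check the local resource or service.
Step 4: the NRPE daemon returns the result to check_nrpe, which hands it to Nagios for processing.
Note: the NRPE daemon requires nagios-plugins to be installed on the remote monitored Linux host; otherwise it cannot perform any checks. In addition, because the two sides communicate over SSL, both must be built with ./configure --enable-ssl --with-ssl-lib=/lib/, otherwise errors occur.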
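The four steps above can be tried by hand from the monitoring server before any service is defined. The sketch below assumes check_nrpe is at /usr/local/nagios/libexec/check_nrpe (override it with the hypothetical CHECK_NRPE variable) and relies on the standard plugin exit codes: 0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN. The function names are illustrative only.

```shell
#!/bin/sh
# Map a Nagios plugin exit code to its state name.
# Any code other than 0, 1 or 2 is reported as UNKNOWN.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Run one remote check through check_nrpe, print the plugin output,
# then print the state derived from the exit code.
# The default check_nrpe path is an assumption about the install layout.
run_remote_check() {
    host="$1"
    cmd="$2"
    output=$("${CHECK_NRPE:-/usr/local/nagios/libexec/check_nrpe}" -H "$host" -c "$cmd" 2>&1)
    rc=$?
    echo "$output"
    echo "state: $(state_name "$rc")"
}
```

For example, run_remote_check 192.168.2.10 check_load asks the NRPE daemon on 192.168.2.10 to run whatever its nrpe.cfg defines as check_load.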
All Nagios monitoring is performed by plugins, so the official plugins must be installed before Nagios is started.
# tar zxf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make
# make install
Install nrpe (on the monitoring server, passing a --prefix path makes things easier; openssl and openssl-devel must already be installed)
tar -zxvf nrpe-2.12.tar.gz && cd nrpe-2.12
./configure --prefix=/usr/local/nrpe --enable-ssl --with-ssl-lib
make all
make install-plugin
make install-daemon
make install-daemon-config
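Configure nrpe
# nrpe configuration
vi /usr/local/nagios/etc/nrpe.cfg (with the --prefix used above, the file may instead be under /usr/local/nrpe/etc/), then find and modify the following lines
server_address=<IP of this host>
# allowed_hosts must include the IP of the Nagios server (192.168.1.100 here)
allowed_hosts=192.168.1.100,127.0.0.1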
5. Configure and start Nagios
(1) Register nagios as a system service and enable it at boot:
# chkconfig --add nagios
# chkconfig nagios on
(2) Check that the main configuration file is syntactically correct:
# /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
(3) If the syntax check above reports no problems, start the nagios service:
# service nagios start
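(4) Configure SELinux
If SELinux is enabled on your system, it blocks the Nagios web CGI programs by default. Check whether SELinux is enabled with:
# getenforce
If the output shows that SELinux is enforcing, you can switch it off temporarily with:
# setenforce 0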
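Note that setenforce 0 only lasts until the next reboot. A minimal sketch for making the change persistent follows, assuming the usual RHEL/CentOS config file /etc/selinux/config and GNU sed. The function name is hypothetical, and the file path can be passed explicitly so the function can be tried on a copy first.

```shell
#!/bin/sh
# Rewrite SELINUX=enforcing to SELINUX=permissive in an SELinux config file.
# $1: path to the config file (defaults to /etc/selinux/config, an assumption
#     that holds on typical RHEL/CentOS systems).
set_selinux_permissive() {
    cfg="${1:-/etc/selinux/config}"
    sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' "$cfg"
}
```

Running set_selinux_permissive as root with no argument edits the system file; other lines such as SELINUXTYPE are left untouched.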
(5) Disable the firewall
(6) View Nagios through the web interface:
http://your_nagios_IP/nagios
Monitored host:
1) First add the nagios user
# useradd -s /sbin/nologin nagios
2) NRPE depends on nagios-plugins, so install that first
# tar zxf nagios-plugins-1.4.15.tar.gz
# cd nagios-plugins-1.4.15
# ./configure --with-nagios-user=nagios --with-nagios-group=nagios
# make all
# make install
3) Install NRPE
# tar -zxvf nrpe-2.12.tar.gz
# cd nrpe-2.12
# ./configure --with-nrpe-user=nagios \
--with-nrpe-group=nagios \
--with-nagios-user=nagios \
--with-nagios-group=nagios \
--enable-command-args \
--enable-ssl
# make all
# make install-plugin
# make install-daemon
# make install-daemon-config
4) Configure NRPE
# vim /usr/local/nagios/etc/nrpe.cfg
log_facility=daemon
pid_file=/var/run/nrpe.pid
server_address=172.16.100.11
server_port=5666
nrpe_user=nagios
nrpe_group=nagios
allowed_hosts=172.16.100.1
command_timeout=60
connection_timeout=300
debug=0
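The directives above are self-explanatory, so simply adjust them to your needs. Note in particular that allowed_hosts defines the IP addresses of the monitoring servers that are permitted to connect to this host.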
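A quick way to confirm that a monitoring server's address really is listed is to parse the allowed_hosts line. The sketch below is an illustration only: the function name is hypothetical, it compares plain IP strings, and it does not understand network ranges or hostnames.

```shell
#!/bin/sh
# Succeed (exit 0) if the IP in $2 appears in the allowed_hosts line
# of the nrpe.cfg-style file in $1; fail otherwise.
is_allowed_host() {
    cfg="$1"
    ip="$2"
    # Drop any trailing comment, keep only the value of allowed_hosts,
    # split it on commas, strip spaces, and look for an exact match.
    sed -n 's/[[:space:]]*#.*$//; s/^allowed_hosts=//p' "$cfg" \
        | tr ',' '\n' | tr -d ' ' | grep -qxF "$ip"
}
```

For example, is_allowed_host /usr/local/nagios/etc/nrpe.cfg 172.16.100.1 && echo allowed prints "allowed" for the configuration shown above.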
5) Start NRPE
# /usr/local/nagios/bin/nrpe -c /usr/local/nagios/etc/nrpe.cfg -d
To make the NRPE service easier to start, the following can be saved as the /etc/init.d/nrped script:
#!/bin/bash
# chkconfig: 2345 88 12
# description: NRPE DAEMON
NRPE=/usr/local/nagios/bin/nrpe
NRPECONF=/usr/local/nagios/etc/nrpe.cfg
case "$1" in
    start)
        echo -n "Starting NRPE daemon..."
        "$NRPE" -c "$NRPECONF" -d
        echo " done."
        ;;
    stop)
        echo -n "Stopping NRPE daemon..."
        pkill -u nagios nrpe
        echo " done."
        ;;
    restart)
        "$0" stop
        sleep 2
        "$0" start
        ;;
    *)
        echo "Usage: $0 start|stop|restart"
        ;;
esac
exit 0
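Key points:
Keep one unified nrpe.cfg on the monitoring server. The default nrpe.cfg lacks many commands, which have to be added individually, and every client can then use this same file.
I have already saved an nrpe.cfg for monitored hosts that can be used directly; only the following needs adjusting for each host: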
server_address=192.168.2.10
allowed_hosts=192.168.2.124,192.168.2.10
command[check_users]=/usr/local/nagios/libexec/check_users -w 5 -c 10
command[check_load]=/usr/local/nagios/libexec/check_load -w 15,10,5 -c 30,25,20
command[check_sda1]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda1
command[check_sda2]=/usr/local/nagios/libexec/check_disk -w 20% -c 10% -p /dev/sda2
command[check_zombie_procs]=/usr/local/nagios/libexec/check_procs -w 5 -c 10 -s Z
command[check_total_procs]=/usr/local/nagios/libexec/check_procs -w 150 -c 200
command[check_mem]=/usr/local/nagios/libexec/check_mem -f -w 20 -c 10 -C
command[check_cpu]=/usr/local/nagios/libexec/check_cpu -w 60 -c 80
#command[check_ntp]=/usr/local/nagios/libexec/check_ntp -H 172.16.30.167 -w 0.5 -c 1
command[check_iostat]=/usr/local/nagios/libexec/check_iostat -d sda1 -w 5000 -c 6000
command[check_swap]=/usr/local/nagios/libexec/check_swap -w 40% -c 20%
command[check_disk]=/usr/local/nagios/libexec/check_disk -w 20 -c 10
command[check_ping]=/usr/local/nagios/libexec/check_ping -H 192.168.2.1 -w 100.0,20% -c 500.0,60%
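A command list like this tends to grow by copy and paste, so it can help to check that no command name is defined twice before distributing the file. The following is a small sketch with a hypothetical function name; it only looks at active command[...] lines and ignores commented ones.

```shell
#!/bin/sh
# Print every command name that is defined more than once in the
# nrpe.cfg-style file given as $1 (one name per line, nothing if none).
dup_commands() {
    # Extract NAME from lines of the form command[NAME]=..., then let
    # sort + uniq -d keep only names that occur at least twice.
    sed -n 's/^command\[\([^]]*\)]=.*/\1/p' "$1" | sort | uniq -d
}
```

An empty result from dup_commands /usr/local/nagios/etc/nrpe.cfg means every command name is unique.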
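Enable the nrpe command in commands.cfg on the monitoring server. By default,
/usr/local/nagios/libexec does not contain check_nrpe; copy
/usr/local/nrpe/libexec/check_nrpe into that directory.
# define the nrpe command in commands.cfg
vi /usr/local/nagios/etc/nagios.cfg and make sure the following line is enabled
cfg_file=/usr/local/nagios/etc/objects/commands.cfg
vi /usr/local/nagios/etc/objects/commands.cfg and add the following definition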
#check nrpe
define command{
command_name check_nrpe
command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
}
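To see what this definition does, it helps to expand the macros by hand: $USER1$ is the plugin directory, $HOSTADDRESS$ is the host's address, and $ARG1$ is whatever follows the "!" in a service's check_command. The sketch below mimics that expansion. The function name is hypothetical, and the default for USER1 assumes the source-install layout used in this document.

```shell
#!/bin/sh
# Print the command line that results from expanding
#   $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
# $1: host address, $2: the argument after "!" in check_command.
expand_check_nrpe() {
    user1="${USER1:-/usr/local/nagios/libexec}"   # assumed plugin directory
    hostaddress="$1"
    arg1="$2"
    echo "$user1/check_nrpe -H $hostaddress -c $arg1"
}
```

So a service with check_command check_nrpe!check_load on host 192.168.2.10 corresponds to expand_check_nrpe 192.168.2.10 check_load, and the printed line can be run manually to debug a failing check.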
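I have packaged the libexec directories of both the client and the monitoring server, so they can be used directly at deployment time.
On the monitoring server, keep hosts and services in separate files, with one service file per host.
I have saved these files as well, so they can be reused directly.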
hosts.cfg
define host{
use linux-server
host_name 192.168.2.10
alias 192.168.2.10
address 192.168.2.10
}
define host{
use linux-server
host_name 192.168.2.123
alias 192.168.2.123
address 192.168.2.123
}
192.168.2.10.cfg
define service{
use generic-service
host_name 192.168.2.10
service_description check_disk
check_command check_nrpe!check_disk
}
define service{
use generic-service
host_name 192.168.2.10
service_description check_load
check_command check_nrpe!check_load
}
define service{
use generic-service
host_name 192.168.2.10
service_description check-users
check_command check_nrpe!check_users
}
define service{
use generic-service
host_name 192.168.2.10
service_description check_cpu
check_command check_nrpe!check_cpu
}
define service{
use generic-service
host_name 192.168.2.10
service_description check_iostat
check_command check_nrpe!check_iostat
}
define service{
use generic-service
host_name 192.168.2.10
service_description check_mem
check_command check_nrpe!check_mem
}
define service{
use generic-service
host_name 192.168.2.10
service_description check_swap
check_command check_nrpe!check_swap
}
define service{
use generic-service
host_name 192.168.2.10
service_description total_procs
check_command check_nrpe!check_total_procs
}
define service{
use generic-service
host_name 192.168.2.10
service_description check_zombie_procs
check_command check_nrpe!check_zombie_procs
}
#define service{
#use generic-service
#host_name 192.168.2.10
#service_description check_ntp
#check_command check_nrpe!check_ntp
#}
#define service{
#use generic-service
#host_name 192.168.2.10
#service_description check_ping
#check_command check_nrpe!check_ping
#}
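Because every per-host service file repeats the same block with only the command name changed, the file can also be generated. The following is a sketch under assumptions: the function name is hypothetical, it uses the command name as the service_description, and it expects the check_nrpe command and the generic-service template to exist as defined earlier.

```shell
#!/bin/sh
# Emit one "define service" block per NRPE command for a host.
# $1: host_name as used in hosts.cfg; remaining arguments: NRPE command names.
gen_services() {
    host="$1"
    shift
    for cmd in "$@"; do
        printf 'define service{\n'
        printf 'use generic-service\n'
        printf 'host_name %s\n' "$host"
        printf 'service_description %s\n' "$cmd"
        printf 'check_command check_nrpe!%s\n' "$cmd"
        printf '}\n'
    done
}
```

For instance, gen_services 192.168.2.123 check_disk check_load check_swap > 192.168.2.123.cfg produces a file for the second host. It still has to be placed where nagios.cfg picks it up, and the configuration should be re-verified before restarting.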
/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
service nagios restart