1. Existence check
Purpose: take an action when a file or service does not exist. The default action is restart.
Syntax:
IF [DOES] NOT EXIST [[<X>] <Y> CYCLES] THEN action [ELSE IF SUCCEEDED [[<X>] <Y> CYCLES] THEN action]
Available actions: ALERT (send an alert), RESTART (restart the service), START (start it), STOP (stop it), EXEC (run a command), UNMONITOR (stop monitoring).
Example:
check process named with pidfile /var/run/named.pid
    start program = "/etc/init.d/named start"
    stop program = "/etc/init.d/named stop"
    if failed port 53 use type udp protocol dns then restart
    if 3 restarts within 5 cycles then timeout
If UDP port 53 does not respond, named is restarted.
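The example above tests a port rather than existence. A minimal sketch of an explicit existence test that follows the syntax in this section, assuming a hypothetical sshd service and a hypothetical log file path (neither comes from the original setup):

```
# Hypothetical names and paths, for illustration only
check process sshd with pidfile /var/run/sshd.pid
    start program = "/etc/init.d/sshd start"
    stop program = "/etc/init.d/sshd stop"
    if does not exist for 2 cycles then restart

check file app_log with path /var/log/myapp/app.log
    if does not exist then alert
```

Waiting 2 cycles before restarting avoids reacting to a process that is only briefly absent, for example during a planned restart.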
2. Resource check
Purpose: check whether a metric of the monitored object has reached a given value, then take the corresponding action.
Syntax:
IF resource operator value [[<X>] <Y> CYCLES] THEN action [ELSE IF SUCCEEDED [[<X>] <Y> CYCLES] THEN action]
resource: the metric being monitored, such as "CPU", "TOTALCPU", "CPU([user|system|wait])", "MEMORY", "SWAP", "CHILDREN", "TOTALMEMORY", "LOADAVG([1min|5min|15min])".
operator: a comparison operator such as >, =, <.
Available actions: ALERT (send an alert), RESTART (restart the service), START (start it), STOP (stop it), EXEC (run a command), UNMONITOR (stop monitoring).
Example:
check system myhost.mydomain.tld
    if loadavg (1min) > 4 then alert
    if loadavg (5min) > 2 then alert
    if memory usage > 75% then alert
    if cpu usage (user) > 70% then alert
    if cpu usage (system) > 30% then alert
    if cpu usage (wait) > 20% then alert
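The resource list also covers per-process metrics such as CPU, TOTALMEMORY and CHILDREN, which the system-level example does not use. A sketch of how they might look on a process check, assuming a hypothetical apache service, init script paths and thresholds. The keyword is spelled totalmem here, as in older Monit manuals; the accepted spelling may vary by version:

```
# Hypothetical service, paths and thresholds, for illustration only
check process apache with pidfile /var/run/httpd.pid
    start program = "/etc/init.d/httpd start"
    stop program = "/etc/init.d/httpd stop"
    if cpu > 60% for 2 cycles then alert
    if totalmem > 200.0 MB for 5 cycles then restart
    if children > 250 then restart
```

Requiring several consecutive cycles before acting keeps a short spike from triggering a restart.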
3. File checksum check
Purpose: detect whether a file has changed.
Example:
check file apache_bin with path /usr/local/apache/bin/httpd
    if failed checksum and expect the sum 8f7f419955cefa0b33a2ba316cba3659 then unmonitor
    if failed permission 755 then unmonitor
    if failed uid root then unmonitor
    if failed gid root then unmonitor
    alert security@foo.bar on { checksum, permission, uid, gid, unmonitor } with the mail-format { subject: Alarm! }
    group server
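When the expected sum is not known in advance, or the file is allowed to change and the change only needs to be reported, Monit's changed checksum form may fit better than a fixed expected sum. A sketch, assuming a hypothetical configuration file path:

```
# Hypothetical name and path, for illustration only
check file httpd_conf with path /etc/httpd/conf/httpd.conf
    if changed checksum then alert
```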
4. File size check
check file test_txt with path /home/laicb/test.txt
    if does not exist for 5 cycles then alert
    if changed size for 1 cycles then alert
Note: if the cycle count is not specified, inspecting the service shows the rule as "for 5 times within 5 cycles".
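Besides reacting to any size change, a size test can also compare against a threshold. A sketch, assuming a hypothetical log file and a 100 MB limit:

```
# Hypothetical name, path and limit, for illustration only
check file myapp_log with path /var/log/myapp.log
    if size > 100 MB then alert
```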
5. UID and GID check
check file passwd with path /etc/passwd
    if failed uid root then unmonitor
check file shadow with path /etc/shadow
    if failed gid root then unmonitor
6. PID file check and uptime check
check process myapp with pidfile /var/run/myapp.pid
    start program = "/etc/init.d/myapp start"
    stop program = "/etc/init.d/myapp stop"
    if uptime > 3 days then restart
7. Disk space monitoring
check filesystem datafs with path /dev/sdb1
    group server
    start program = "/bin/mount /data"
    stop program = "/bin/umount /data"
    if failed permission 660 then unmonitor
    if failed uid root then unmonitor
    if failed gid disk then unmonitor
    if space usage > 80 % then alert
    if space usage > 94 % then stop
    if inode usage > 80 % then alert
    if inode usage > 94 % then stop
    alert root@localhost
8. ICMP check
check host www.tildeslash.com with address www.tildeslash.com
    if failed icmp type echo count 5 with timeout 15 seconds then alert