Notes:
Red text marks monitor values obtained from the Linux command line
Blue text marks standard Zabbix agent item keys
Green text marks user-defined configuration in zabbix_agentd.userparams.conf and user-defined item keys
Item description:
The load average is the average system load over a period of time; it helps to judge whether the CPU capacity is sufficient
Linux Command:
# uptime
01:25:05 up 51 days, 1:23, 3 users, load average: 3.00, 3.05, 3.02
Zabbix Item Key:
system.cpu.load[,avg1]
Suggested Zabbix Trigger:
{system.cpu.load[,avg1].min(300)}>10
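As a rough cross-check of the value behind the trigger above, the 1-minute figure can be extracted from the `uptime` output. This is a minimal sketch, not part of Zabbix; the function name `load_avg1` is hypothetical.

```shell
# load_avg1: read `uptime` output on stdin and print the 1-minute load average.
# The function name is illustrative only.
load_avg1() {
    sed -n 's/.*load average: *\([0-9.]*\),.*/\1/p'
}

# Usage on a live host:
#   uptime | load_avg1
```

With the sample output shown above, this prints 3.00.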
Item description:
Show the percentage of total CPU utilization. For multiprocessor systems, the CPU values are global averages across all processors. It can help to find system load problems
Linux Command:
# iostat -x 60
avg-cpu: %user %nice %system %iowait %steal %idle
0.15 0.01 1.61 0.23 0.00 98.01
Calculation:
CPU utilization=%user+%nice+%system=0.15+0.01+1.61=1.77
last["system.cpu.util[,user,avg1]","system.cpu.util[,nice,avg1]","system.cpu.util[,system,avg1]"]
Formula in Zabbix:
last("system.cpu.util[,user,avg1]")+last("system.cpu.util[,nice,avg1]")+last("system.cpu.util[,system,avg1]")
Suggested Zabbix Trigger:
{last["system.cpu.util[,user,avg1]","system.cpu.util[,nice,avg1]","system.cpu.util[,system,avg1]"].min(300)}>10
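The calculation above (%user + %nice + %system) can be reproduced on the command line for comparison with the Zabbix formula. This is a minimal sketch, assuming the `avg-cpu` layout shown above; the function name `cpu_util` is hypothetical.

```shell
# cpu_util: read `iostat` output on stdin and print %user + %nice + %system
# for the most recent avg-cpu report. The function name is illustrative only.
cpu_util() {
    awk '
        /^avg-cpu:/ { want = 1; next }   # header line; the values follow on the next line
        want        { total = $1 + $2 + $3; want = 0 }
        END         { printf "%.2f\n", total }
    '
}

# Usage on a live host (two reports, 60 seconds apart):
#   iostat -x 60 2 | cpu_util
```

With the sample values shown above (0.15, 0.01, 1.61), this prints 1.77.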
Item description:
Show the percentage of CPU utilization that occurred while executing at the system level (kernel).
Linux Command:
# iostat -x 60
avg-cpu: %user %nice %system %iowait %steal %idle
0.15 0.01 1.61 0.23 0.00 98.01
Zabbix Item Key:
system.cpu.util[,system,avg1]
Suggested Zabbix Trigger:
{system.cpu.util[,system,avg1].min(300)}>10
Item description:
Show the percentage of CPU utilization that occurred while executing at the user level (application).
Linux Command:
# iostat -x 60
avg-cpu: %user %nice %system %iowait %steal %idle
0.15 0.01 1.61 0.23 0.00 98.01
Zabbix Item Key:
system.cpu.util[,user,avg1]
Suggested Zabbix Trigger:
{system.cpu.util[,user,avg1].min(300)}>10
Item description:
Display the total amount of swap space used in the system; it can help to find memory usage problems
Linux Command:
#free
total used free shared buffers cached
Mem: 252376 193632 58744 0 87840 52488
-/+ buffers/cache: 53304 199072
Swap: 522104 0 522104
Zabbix Item Key:
system.swap.size[,used]
Suggested Zabbix Trigger:
{system.swap.size[,used].last(0)}>1000000
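Note that `free` reports kibibytes by default, while `system.swap.size[,used]` returns bytes, so the threshold above is in bytes. A minimal sketch for converting the `free` figure to the same unit follows; the function name `swap_used_bytes` is hypothetical.

```shell
# swap_used_bytes: read `free` output on stdin and print used swap in bytes.
# Assumes the default `free` unit (KiB). The function name is illustrative only.
swap_used_bytes() {
    awk '/^Swap:/ { printf "%.0f\n", $3 * 1024 }'
}

# Usage on a live host:
#   free | swap_used_bytes
```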
Item description:
The buffer cache keeps recently used file blocks in memory so that programs can reach their data blocks faster; monitoring it can help with MySQL tuning
Linux Command:
#free
total used free shared buffers cached
Mem: 252376 193632 58744 0 87840 52488
-/+ buffers/cache: 53304 199072
Swap: 522104 0 522104
Zabbix Item Key:
vm.memory.size[buffers]
Suggested Zabbix Trigger:
not necessary
Item description:
Show the physical memory (resident set size) used by the MySQL instance; it can help with MySQL tuning
Linux Command:
# cat /proc/$(pgrep -u mysql mysqld)/status |grep VmRSS
VmRSS: 11744 kB
User-defined configuration:
UserParameter=mysql.mem.use,cat /proc/$(pgrep -u mysql mysqld)/status |grep VmRSS |awk '{print $2}'
Zabbix Item Key:
mysql.mem.use
Suggested Zabbix Trigger(It depends on Business Scenarios):
{mysql.mem.use.min(300)}>3000000
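The value comes from the VmRSS line of the process status file and is expressed in kB, so the threshold above is in kB as well. A minimal sketch of the extraction step, written to read the status text from stdin, is shown below; the function name `vmrss_kb` is hypothetical, and the usage line assumes that a single mysqld process runs under the mysql user.

```shell
# vmrss_kb: read the contents of /proc/<pid>/status on stdin and print
# the resident set size in kB. The function name is illustrative only.
vmrss_kb() {
    awk '$1 == "VmRSS:" { print $2; exit }'
}

# Usage on a live host (assumes one mysqld process owned by user mysql):
#   vmrss_kb < "/proc/$(pgrep -u mysql -x mysqld | head -n 1)/status"
```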
Item description:
Show the traffic received and transmitted by each NIC; it can help to detect abnormal traffic throughput between MySQL and the application servers
Linux Command:
# ifconfig
Zabbix Item Key:
RX:
net.if.in[eth3,bytes] net.if.in[eth0,bytes]
TX:
net.if.out[eth3,bytes] net.if.out[eth0,bytes]
Suggested Zabbix Trigger(It depends on Business Scenarios):
{net.if.in[eth0,bytes].min(300)}>10000000
{net.if.out[eth0,bytes].min(300)}>10000000
{net.if.in[eth3,bytes].min(300)}>10000000
{net.if.out[eth3,bytes].min(300)}>10000000
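The byte counters behind these items are cumulative, so a threshold such as the one above is only meaningful if the item is stored as a per-second delta; that setting is an assumption here, since the notes do not state it. As an alternative to reading `ifconfig` by eye, the same counters can be read from /proc/net/dev. A minimal sketch follows; the function name `if_bytes` is hypothetical.

```shell
# if_bytes IFACE: read /proc/net/dev-style text on stdin and print
# "<rx_bytes> <tx_bytes>" for the given interface.
# The function name is illustrative only.
if_bytes() {
    awk -v ifc="$1" '
        { sub(/:/, " ") }              # split "eth0:123" into "eth0 123"
        $1 == ifc { print $2, $10 }    # first counter is RX bytes, ninth is TX bytes
    '
}

# Usage on a live host:
#   if_bytes eth0 < /proc/net/dev
```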
Item description:
Show the number of bytes waiting in the TCP receive queue and send queue; it can help to observe TCP socket performance
Linux Command:
#netstat -ant|grep ESTABLISHED|awk '{sum += $2} END {print sum}'
0
User-defined configuration:
UserParameter=TCP.Recv.Queue,netstat -ant|grep ESTABLISHED|awk '{sum += $2} END {print sum}'
Zabbix Item Key:
TCP.Recv.Queue
Suggested Zabbix Trigger:
not necessary
Linux Command:
#netstat -ant|grep ESTABLISHED|awk '{sum += $3} END {print sum}'
194032
User-defined configuration:
UserParameter=TCP.Send.Queue,netstat -ant|grep ESTABLISHED|awk '{sum += $3 } END {print sum}'
Zabbix Item Key:
TCP.Send.Queue
Suggested Zabbix Trigger:
not necessary
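Both queue totals can also be computed in a single pass over the `netstat -ant` output. The sketch below matches on the State column rather than grepping the whole line, and it prints 0 instead of an empty string when there are no established connections. The function name `tcp_queues` is hypothetical.

```shell
# tcp_queues: read `netstat -ant` output on stdin and print
# "<recv_q_total> <send_q_total>" for ESTABLISHED connections.
# The function name is illustrative only.
tcp_queues() {
    awk '
        $6 == "ESTABLISHED" { rq += $2; sq += $3 }
        END { print rq + 0, sq + 0 }   # "+ 0" forces a numeric 0 when nothing matched
    '
}

# Usage on a live host:
#   netstat -ant | tcp_queues
```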
Item description:
Show the NUMA memory allocation statistics (hits and misses) for each node; it can help with system performance tuning
Linux Command:
# numastat
node0 node1
numa_hit 466451682 650742379
numa_miss 0 111010
numa_foreign 111010 0
interleave_hit 928427 929122
local_node 452653412 650663005
other_node 13798270 190384
User-defined configuration:
UserParameter=numa.node0.hit,numastat |grep numa_hit|awk '{print $2}'
UserParameter=numa.node1.hit,numastat |grep numa_hit|awk '{print $3}'
UserParameter=numa.node0.miss,numastat |grep numa_miss|awk '{print $2}'
UserParameter=numa.node1.miss,numastat |grep numa_miss|awk '{print $3}'
Zabbix Item Key:
numa.node0.hit
numa.node1.hit
numa.node0.miss
numa.node1.miss
Suggested Zabbix Trigger:
not necessary
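The four UserParameter lines above differ only in the counter name and the node column, so the lookup can be expressed once. This is a minimal sketch, assuming the `numastat` layout shown above; the function name `numa_counter` is hypothetical.

```shell
# numa_counter NAME NODE: read `numastat` output on stdin and print counter NAME
# (for example numa_miss) for the zero-based node index NODE.
# The function name is illustrative only.
numa_counter() {
    awk -v name="$1" -v node="$2" '$1 == name { print $(node + 2) }'
}

# Usage on a live host:
#   numastat | numa_counter numa_miss 1
```

With the sample output shown above, `numa_counter numa_miss 1` prints 111010.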
Item description:
Show the I/O scheduler in use on the disk that holds the MySQL data partition; it can help with MySQL performance tuning
Linux Command:
#cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]
User-defined configuration:
UserParameter=sda.io.scheduler,tr -s " " "\n" < /sys/block/sda/queue/scheduler|grep '\[.*\]'
Zabbix Item Key:
sda.io.scheduler
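The UserParameter above returns the active scheduler with its surrounding brackets, for example [cfq]. If the bare name is preferred, the brackets can be stripped. A minimal sketch follows; the function name `active_scheduler` is hypothetical.

```shell
# active_scheduler: read the contents of /sys/block/<dev>/queue/scheduler on stdin
# and print the name of the active scheduler without brackets.
# The function name is illustrative only.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Usage on a live host:
#   active_scheduler < /sys/block/sda/queue/scheduler
```

With the sample output shown above, this prints cfq.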
Suggested Zabbix Trigger:
Item description:
Show the I/O status of the MySQL data partition; it can help with MySQL performance tuning and with finding performance issues
Notes:
await = average I/O response time (in milliseconds)
avgrq-sz = average I/O request size (in sectors)
avgqu-sz = average I/O queue length
r/s + w/s = I/O operations per second
rkB/s + wkB/s = throughput (in kilobytes per second)
%util = percentage of CPU time during which I/O requests were issued to the device (device bandwidth utilization)
Linux Command:
# iostat -k -x -d 60
Device: rrqm/s wrqm/s r/s w/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util
sda 0.00 0.37 0.00 0.58 0.00 3.80 13.03 0.00 0.17 0.09 0.01
sda2 0.00 0.37 0.00 0.58 0.00 3.80 13.03 0.00 0.17 0.09 0.01
Specify 60 seconds between each report and display statistics in kilobytes per second
#nohup iostat -k -x -d 60 > /tmp/iostat_output &
Note: This assumes that the MySQL data partition is sda2
User-defined configuration:
UserParameter=sda2.rps,grep sda2 /tmp/iostat_output |tail -1|awk '{print $4}'
UserParameter=sda2.wps,grep sda2 /tmp/iostat_output |tail -1|awk '{print $5}'
UserParameter=sda2.rkBps,grep sda2 /tmp/iostat_output |tail -1|awk '{print $6}'
UserParameter=sda2.wkBps,grep sda2 /tmp/iostat_output |tail -1|awk '{print $7}'
UserParameter=sda2.avgrq-sz,grep sda2 /tmp/iostat_output |tail -1|awk '{print $8}'
UserParameter=sda2.avgqu-sz,grep sda2 /tmp/iostat_output |tail -1|awk '{print $9}'
UserParameter=sda2.await,grep sda2 /tmp/iostat_output |tail -1|awk '{print $10}'
UserParameter=sda2.util,grep sda2 /tmp/iostat_output |tail -1|awk '{print $12}'
Zabbix Item Key:
sda2.rps
sda2.wps
sda2.rkBps
sda2.wkBps
sda2.avgrq-sz
sda2.avgqu-sz
sda2.await
sda2.util
IO Per Second = last["sda2.rps","sda2.wps"]
IO Throughput = last["sda2.rkBps","sda2.wkBps"]
Formula in Zabbix:
IO Per Second =last("sda2.rps")+last("sda2.wps")
IO Throughput =last("sda2.rkBps")+last("sda2.wkBps")
Suggested Zabbix Trigger(It depends on Business Scenarios):
{last("sda2.rps")+last("sda2.wps").min(300)}>1000
{last("sda2.rkBps")+last("sda2.wkBps").min(300)}>1000
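The two sums above (r/s + w/s and rkB/s + wkB/s) can be checked against the collected iostat output directly. This is a minimal sketch that assumes the column layout shown above; newer sysstat releases order the columns differently, in which case the field numbers would need adjusting. The function name `io_totals` is hypothetical.

```shell
# io_totals DEVICE: read `iostat -k -x -d` output on stdin and print
# "<IOPS> <throughput_kB_per_s>" from the last report line for DEVICE.
# Field numbers follow the column layout shown in this document.
# The function name is illustrative only.
io_totals() {
    awk -v dev="$1" '
        $1 == dev { iops = $4 + $5; kbps = $6 + $7; seen = 1 }   # later reports overwrite earlier ones
        END { if (seen) printf "%.2f %.2f\n", iops, kbps }
    '
}

# Usage on a live host:
#   io_totals sda2 < /tmp/iostat_output
```

With the sample line for sda2 shown above, this prints 0.58 3.80.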
Item description:
Show how often the kernel flusher threads wake up to write dirty pages to disk, in hundredths of a second; it is relevant to data persistence
Linux Command:
# cat /proc/sys/vm/dirty_writeback_centisecs
499
User-defined configuration:
UserParameter=fsync.flush.time,cat /proc/sys/vm/dirty_writeback_centisecs
Zabbix Item Key:
fsync.flush.time
Suggested Zabbix Trigger:
not necessary
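The value is expressed in centiseconds, so the sample value 499 corresponds to roughly 5 seconds between flusher wake-ups. A minimal conversion sketch follows; the function name `centisecs_to_seconds` is hypothetical.

```shell
# centisecs_to_seconds: read a centisecond value on stdin and print it in seconds.
# The function name is illustrative only.
centisecs_to_seconds() {
    awk '{ printf "%.2f\n", $1 / 100 }'
}

# Usage on a live host:
#   centisecs_to_seconds < /proc/sys/vm/dirty_writeback_centisecs
```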
Item description:
Show the number of open file handles (file descriptors) allocated across all processes; it can help to find potential program issues or network performance problems
Linux Command:
# cat /proc/sys/fs/file-nr
510 0 3223952
User-defined configuration:
UserParameter=file.desciptor.used,cat /proc/sys/fs/file-nr|awk '{print $1}'
Zabbix Item Key:
file.desciptor.used
Suggested Zabbix Trigger:
{file.desciptor.used.min(300)}>10000
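The three fields of /proc/sys/fs/file-nr are the number of allocated file handles, the number of allocated but unused handles, and the system-wide maximum. The fixed threshold above uses only the first field. As one possible variation, usage can be expressed relative to the maximum. This is a minimal sketch; the function name `fd_usage_pct` is hypothetical.

```shell
# fd_usage_pct: read the contents of /proc/sys/fs/file-nr on stdin and print
# allocated handles as a percentage of the system-wide maximum.
# The function name is illustrative only.
fd_usage_pct() {
    awk '$3 > 0 { printf "%.4f\n", $1 / $3 * 100 }'
}

# Usage on a live host:
#   fd_usage_pct < /proc/sys/fs/file-nr
```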
Item description:
Show whether any RAID array is degraded; it can help to check the working status of the hard disks
Linux Command:
Install MegaCli rpm Package:
#rpm -ivh MegaCli-1.01.39-0.i386.rpm
# MegaCli64 -cfgdsply -aALL|grep State
State: Optimal
State: Optimal
User-defined configuration:
UserParameter=raid.status,sudo /opt/MegaRAID/MegaCli/MegaCli64 -cfgdsply -aALL -NoLog|grep Optimal|wc -l
Zabbix Item Key:
raid.status
Suggested Zabbix Trigger:
{raid.status.last(0)}<2
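The trigger above assumes that the host has exactly two arrays, because it compares the number of Optimal lines with 2. One possible variation is to count the State lines that are not Optimal, so the check does not depend on the number of arrays; an item built on it could then alert on any value above 0. This is a minimal sketch, assuming the "State: ..." line format shown above; the function name `raid_not_optimal` is hypothetical.

```shell
# raid_not_optimal: read `MegaCli64 -cfgdsply -aALL` output on stdin and print
# the number of "State" lines whose value is not Optimal.
# The function name is illustrative only.
raid_not_optimal() {
    awk '
        /^[[:space:]]*State[[:space:]]*:/ && !/Optimal/ { bad++ }
        END { print bad + 0 }   # "+ 0" prints 0 when every array is Optimal
    '
}

# Usage on a live host:
#   sudo /opt/MegaRAID/MegaCli/MegaCli64 -cfgdsply -aALL -NoLog | raid_not_optimal
```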