NAME
collectl - Collects data that describes the current system status.
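In plain terms: it collects data about the current state of the system and displays it.
Collectl is a tool for collecting system metrics. It can run either as a daemon or interactively, and it collects data from a range of subsystems. It also includes a Graphite interface, so data can easily be passed to Graphite for storage.
Here is the official description: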
There are a number of times in which you find yourself needing performance data. These can include benchmarking, monitoring a system's general health or trying to determine what your system was doing at some time in the past. Sometimes you just want to know what the system is doing right now. Depending on what you're doing, you often end up using different tools, each designed for that specific situation. Unlike most monitoring tools that either focus on a small set of statistics, format their output in only one way, run either interactively or as a daemon but not both, collectl tries to do it all. You can choose to monitor any of a broad set of subsystems which currently include buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp.
Download: http://sourceforge.net/projects/collectl/files/
Installation is very simple, so I won't go into it: install from an RPM package or from source.
Usage
collectl has three run modes:
1. Interactive Mode: this is the default. In this mode data is read from /proc and passed through analysis.
2. Record Mode: read data from the live system and write it to a file or display it on the terminal
Syntax: collectl [-f file] [options]
3. Playback Mode: read data from one or more raw data files and display it on the terminal
Syntax: collectl -p file1 [file2 ...] [options]
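The record and playback syntax can be sketched as a pair of small Python helpers that only assemble argument lists. This is a minimal sketch, not part of collectl: the function names and the example paths are hypothetical, and it uses only switches that appear in collectl's own help text (-f, -p, -s, -i, -c).

```python
def record_command(target, subsystems=None, interval=None, count=None):
    """Build an argument list for record mode: collectl -f file [options]."""
    cmd = ["collectl", "-f", target]
    if subsystems:
        cmd.append("-s" + subsystems)
    if interval is not None:
        cmd += ["-i", str(interval)]
    if count is not None:
        cmd += ["-c", str(count)]
    return cmd


def playback_command(files, subsystems=None):
    """Build an argument list for playback mode: collectl -p file1 [file2 ...]."""
    if not files:
        raise ValueError("playback needs at least one raw data file")
    cmd = ["collectl", "-p", *files]
    if subsystems:
        cmd.append("-s" + subsystems)
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(" ".join(record_command("/tmp/perf", subsystems="cdm", interval=5, count=12)))
    print(" ".join(playback_command(["/tmp/perf/sample.raw"], subsystems="d")))
```

On a machine with collectl installed, the returned list could be handed to subprocess.run unchanged.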
Among the many monitoring tools available, collectl probably covers the widest range of performance data. The subsystems it can monitor are:
SUMMARY SUBSYSTEMS -- brief output.
b - buddy info (memory fragmentation)
c - CPU
d - Disk
f - NFS V3 Data
i - Inode and File System
j - Interrupts
l - Lustre
m - Memory
n - Networks
s - Sockets
t - TCP
x - Interconnect
y - Slabs (system object caches)
DETAIL SUBSYSTEMS -- more detailed output.
C - CPU
D - Disk
E - Environmental data (fan, power, temp), via ipmitool
F - NFS Data
J - Interrupts
L - Lustre OST detail OR client Filesystem detail
M - Memory node data, which is also known as numa data
N - Networks
T - 65 TCP counters only available in plot format
X - Interconnect
Y - Slabs (system object caches)
Z - Processes
These subsystems are selected with the -s switch, for example: collectl -ss. The switch can be used in playback mode as well.
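To make the letter codes easier to read back, the two lists can be turned into a lookup table. The sketch below is only an illustration: the dictionary contents are copied from the lists in this article, while the names SUMMARY, DETAIL and describe_subsystems are hypothetical.

```python
# Subsystem letters as listed in this article (lowercase = summary, uppercase = detail).
SUMMARY = {
    "b": "buddy info (memory fragmentation)",
    "c": "CPU",
    "d": "Disk",
    "f": "NFS V3 Data",
    "i": "Inode and File System",
    "j": "Interrupts",
    "l": "Lustre",
    "m": "Memory",
    "n": "Networks",
    "s": "Sockets",
    "t": "TCP",
    "x": "Interconnect",
    "y": "Slabs (system object caches)",
}
DETAIL = {
    "C": "CPU",
    "D": "Disk",
    "E": "Environmental data (fan, power, temp)",
    "F": "NFS Data",
    "J": "Interrupts",
    "L": "Lustre OST detail OR client Filesystem detail",
    "M": "Memory node data (numa)",
    "N": "Networks",
    "T": "TCP counters",
    "X": "Interconnect",
    "Y": "Slabs (system object caches)",
    "Z": "Processes",
}


def describe_subsystems(letters):
    """Return (letter, kind, description) for each letter of a -s argument."""
    result = []
    for letter in letters:
        if letter in SUMMARY:
            result.append((letter, "summary", SUMMARY[letter]))
        elif letter in DETAIL:
            result.append((letter, "detail", DETAIL[letter]))
        else:
            raise ValueError(f"unknown subsystem letter: {letter!r}")
    return result


if __name__ == "__main__":
    for row in describe_subsystems("sD"):
        print(row)
```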
Common switches and what they do:
Run without any switches, collectl displays the following by default:
[root@twexdb1 qzhijun]# collectl
waiting for 1 second sample...
#<----CPU[HYPER]-----><----------Disks-----------><----------Network---------->
#cpu sys inter ctxsw KBRead Reads KBWrit Writes KBIn PktIn KBOut PktOut
0 0 1032 439 0 0 0 0 2 23 6 21
0 0 1049 345 8 16 265 10 0 3 1 6
0 0 1074 229 0 0 0 0 3 25 6 23
0 0 1091 226 0 0 0 0 2 19 3 16
As you can see, the output covers CPU, Disks and Network, each in brief form.
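If this brief output is captured to a file, it is easy to post-process. The following is a minimal sketch of a parser, assuming the layout matches the transcript above: a "#<...>" group line, a "#" line of column names, then whitespace-separated rows. The function name parse_brief is hypothetical, and values are kept as strings because other subsystems print suffixed values such as 118M.

```python
SAMPLE = """\
[root@twexdb1 qzhijun]# collectl
waiting for 1 second sample...
#<----CPU[HYPER]-----><----------Disks-----------><----------Network---------->
#cpu sys inter ctxsw KBRead Reads KBWrit Writes KBIn PktIn KBOut PktOut
0 0 1032 439 0 0 0 0 2 23 6 21
0 0 1049 345 8 16 265 10 0 3 1 6
"""


def parse_brief(text):
    """Turn brief-format collectl output into a list of {column: value} dicts."""
    columns = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#<"):
            continue  # blank line or the subsystem group banner
        if line.startswith("#"):
            columns = line[1:].split()  # the column-name line
            continue
        values = line.split()
        # Skip anything that is not a data row (prompt, "waiting...", "Ouch!").
        if columns is None or len(values) != len(columns):
            continue
        rows.append(dict(zip(columns, values)))
    return rows


if __name__ == "__main__":
    for row in parse_brief(SAMPLE):
        print(row["KBRead"], row["KBWrit"])
```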
-s selects the subsystems to display
1. Showing summary data for selected subsystems:
Examples:
1). Show only the brief CPU data
[root@twexdb1 qzhijun]# collectl -sc
waiting for 1 second sample...
#<----CPU[HYPER]----->
#cpu sys inter ctxsw
0 0 1099 342
0 0 1060 355
0 0 1115 266
0 0 1032 147
Ouch!
2). Show brief memory and disk data together
[root@twexdb1 qzhijun]# collectl -sdm
waiting for 1 second sample...
#<-----------Memory-----------><----------Disks----------->
#Free Buff Cach Inac Slab Map KBRead Reads KBWrit Writes
118M 270M 5G 5G 223M 1G 0 0 264 8
118M 270M 5G 5G 223M 1G 0 0 0 0
118M 270M 5G 5G 223M 1G 0 0 52 10
119M 270M 5G 5G 223M 1G 8 16 1157 52
119M 270M 5G 5G 223M 1G 0 0 0 0
Ouch!
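The -s switch can also add subsystems to, or remove them from, what collectl shows when run without switches, using + and -.
3). Add memory to the default display: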
[root@twexdb1 qzhijun]# collectl -s+m
waiting for 1 second sample...
#<----CPU[HYPER]-----><-----------Memory-----------><----------Disks-----------><----------Network---------->
#cpu sys inter ctxsw Free Buff Cach Inac Slab Map KBRead Reads KBWrit Writes KBIn PktIn KBOut PktOut
0 0 2348 1851 116M 270M 5G 5G 223M 1G 0 0 0 0 2 22 4 19
1 0 3513 3354 116M 270M 5G 5G 223M 1G 0 0 316 18 78 777 120 701
0 0 1108 304 116M 270M 5G 5G 223M 1G 8 16 1 1 142 1605 184 1368
0 0 1151 683 115M 270M 5G 5G 223M 1G 0 0 28 4 9 65 31 60
Ouch!
4). Add both memory and network to the default display:
[root@twexdb1 qzhijun]# collectl -s+mn
waiting for 1 second sample...
#<----CPU[HYPER]-----><-----------Memory-----------><----------Disks-----------><----------Network---------->
#cpu sys inter ctxsw Free Buff Cach Inac Slab Map KBRead Reads KBWrit Writes KBIn PktIn KBOut PktOut
0 0 1032 554 116M 270M 5G 5G 224M 1G 0 0 352 9 4 40 11 35
0 0 1032 180 116M 270M 5G 5G 224M 1G 0 0 0 0 1 11 2 12
0 0 1026 174 116M 270M 5G 5G 224M 1G 8 16 1 1 1 4 1 6
0 0 1032 177 116M 270M 5G 5G 224M 1G 0 0 0 0 1 4 1 7
Ouch!
5). Remove CPU from the default display:
[root@twexdb1 qzhijun]# collectl -s-c
waiting for 1 second sample...
#<----------Disks-----------><----------Network---------->
#KBRead Reads KBWrit Writes KBIn PktIn KBOut PktOut
8 16 1 1 29 278 52 230
0 0 0 0 50 556 69 463
0 0 20 3 6 49 14 46
0 0 1516 81 74 675 235 603
8 16 337 8 2 18 8 21
0 0 0 0 1 4 1 6
Ouch!
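The effect of the + and - prefixes in examples 3) to 5) can be modelled as simple set arithmetic on the default subsystems (cdn, according to collectl --help). This is a sketch of the behaviour seen in the transcripts, not collectl's actual implementation; it handles only a single leading + or -, and the function name is hypothetical.

```python
DEFAULT_SUBSYSTEMS = "cdn"  # default reported by `collectl --help`


def resolve_subsystems(spec, default=DEFAULT_SUBSYSTEMS):
    """Return the set of subsystem letters selected by a -s argument."""
    if spec.startswith("+"):
        return set(default) | set(spec[1:])  # add to the defaults
    if spec.startswith("-"):
        return set(default) - set(spec[1:])  # remove from the defaults
    return set(spec)  # a plain list replaces the defaults


if __name__ == "__main__":
    for spec in ("dm", "+m", "+mn", "-c"):
        print(spec, "".join(sorted(resolve_subsystems(spec))))
```

Note that -s+mn yields the same set as -s+m, because network is already part of the defaults; this matches the identical headers in examples 3) and 4).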
2. Showing detail data for selected subsystems:
[root@twexdb1 qzhijun]# collectl -sD
waiting for 1 second sample...
# DISK STATISTICS (/sec)
# <---------reads---------><---------writes---------><--------averages--------> Pct
#Name KBytes Merged IOs Size KBytes Merged IOs Size RWSize QLen Wait SvcTim Util
c0d0 0 0 0 0 0 0 0 0 0 0 0 0 0
sda 8 0 16 1 0 0 1 1 0 1 0 0 0
sdb 0 0 0 0 44 5 6 7 7 2 0 0 0
sdc 0 0 0 0 0 0 0 0 0 0 0 0 0
dm-0 8 0 16 1 0 0 1 1 0 1 0 0 0
dm-1 0 0 0 0 44 0 11 4 4 4 0 0 0
dm-2 0 0 0 0 0 0 0 0 0 0 0 0 0
dm-3 0 0 0 0 0 0 0 0 0 0 0 0 0
c0d0 0 0 0 0 0 0 0 0 0 0 0 0 0
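In the detail disk output the column names repeat (KBytes, Merged, IOs and Size appear once for reads and once for writes), so a parser has to name the fields by position. The sketch below assumes the 13-column layout and integer values shown in the transcript above; the field names and the function name are hypothetical.

```python
# Positional field names for one "-sD" row, following the header in the transcript.
DISK_FIELDS = [
    "name",
    "read_kbytes", "read_merged", "read_ios", "read_size",
    "write_kbytes", "write_merged", "write_ios", "write_size",
    "rwsize", "qlen", "wait", "svctim", "util",
]

SAMPLE = """\
# DISK STATISTICS (/sec)
# <---------reads---------><---------writes---------><--------averages--------> Pct
#Name KBytes Merged IOs Size KBytes Merged IOs Size RWSize QLen Wait SvcTim Util
sda 8 0 16 1 0 0 1 1 0 1 0 0 0
sdb 0 0 0 0 44 5 6 7 7 2 0 0 0
"""


def parse_disk_detail(text):
    """Parse detail disk rows into dicts keyed by DISK_FIELDS."""
    records = []
    for line in text.splitlines():
        parts = line.split()
        if line.lstrip().startswith("#") or len(parts) != len(DISK_FIELDS):
            continue  # header line or something that is not a disk row
        name, values = parts[0], parts[1:]
        if not all(v.isdigit() for v in values):
            continue  # this sketch only handles integer counters
        record = {"name": name}
        record.update(zip(DISK_FIELDS[1:], (int(v) for v in values)))
        records.append(record)
    return records


if __name__ == "__main__":
    busiest = max(parse_disk_detail(SAMPLE), key=lambda r: r["write_kbytes"])
    print(busiest["name"], busiest["write_kbytes"])
```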
You can also restrict the output to a specific disk with --dskfilt:
[root@twexdb1 qzhijun]# collectl -sD --dskfilt sdb
waiting for 1 second sample...
# DISK STATISTICS (/sec)
# <---------reads---------><---------writes---------><--------averages--------> Pct
#Name KBytes Merged IOs Size KBytes Merged IOs Size RWSize QLen Wait SvcTim Util
sdb 0 0 0 0 0 0 0 0 0 0 0 0 0
sdb 0 0 0 0 0 0 0 0 0 0 0 0 0
sdb 0 0 0 0 0 0 0 0 0 0 0 0 0
sdb 0 0 0 0 0 0 0 0 0 0 0 0 0
sdb 0 0 0 0 0 0 0 0 0 0 0 0 0
Ouch!
Monitoring a specific process:
[root@twexdb1 qzhijun]# collectl -sZ --procfilt Cmysql --procopts c
waiting for 60 second sample...
# PROCESS SUMMARY (counters are /sec)
# PID User PR PPID THRD S VSZ RSS CP SysT UsrT Pct AccuTime MajF MinF Command
6839 root 18 1 0 S 10M 1M 3 0.00 0.00 0 00:00.09 0 0 /bin/sh
7002 mysql 14 6839 300 S 2G 1G 15 0.18 3.96 6 728:25:39 0 0 /usr/local/mysql/bin/mysqld
Ouch!
--procfilt Process Filters
c - substring of the command being executed as explicitly read from /proc/pid/stat. Note that this can actually be a perl expression, so if you want a command that ends in a particular string all you need to do is append a $ to the end of the string. Otherwise it would match any commands containing that string.
C - any command that starts with the specified string
f - full path of the command, including arguments, as read from /proc/pid/cmdline. Like the c modifier this too can be a perl expression.
p - pid
P - parent pid
u - any process owned by this user's UID or in the range specified by uxxx-yyy
U - any process owned by this username
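The difference between the c and C modifiers is easy to miss: c matches anywhere in the command name and accepts an expression (so an end-of-string anchor is allowed), while C only matches a prefix. The sketch below illustrates just these two modifiers. It is an approximation: it uses Python's re module as a stand-in for Perl expressions, and the function name is hypothetical.

```python
import re


def matches_procfilt(filt, command):
    """Approximate the 'c' and 'C' --procfilt modifiers for one command name."""
    kind, pattern = filt[:1], filt[1:]
    if kind == "c":
        # Expression match anywhere in the command name.
        return re.search(pattern, command) is not None
    if kind == "C":
        # Plain prefix match.
        return command.startswith(pattern)
    raise ValueError(f"modifier {kind!r} is not covered by this sketch")


if __name__ == "__main__":
    for filt in ("Cmysql", "cmysql", "cmysql$"):
        print(filt, [name for name in ("mysqld", "safe_mysqld", "pymysql")
                     if matches_procfilt(filt, name)])
```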
--top displays output in real time, much like the top utility on Linux.
For example:
collectl -sCj --top
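--iosize: show the average I/O size (adds the Size columns)
Showing timestamps:
-oT show the time
-oD show the date and time
-oDm show the date, time and milliseconds
-i sets the sampling interval (in seconds)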
[root@twexdb1 qzhijun]# collectl -sm -i 2
waiting for 2 second sample...
#<-----------Memory----------->
#Free Buff Cach Inac Slab Map
120M 276M 5G 5G 224M 1G
120M 276M 5G 5G 224M 1G
120M 276M 5G 5G 224M 1G
120M 276M 5G 5G 224M 1G
121M 276M 5G 5G 224M 1G
121M 276M 5G 5G 223M 1G
Example:
Sample system data every 1/4 second and save it to a log file:
collectl -i.25 -oDm --iosize > testPerf.log
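The command above runs until it is interrupted. To stop after a fixed duration instead, the -c switch from the help text ("collect this number of samples and exit") can be combined with -i, which means working out how many samples fit in the duration. A minimal sketch of that arithmetic follows; the function names are hypothetical, and Fraction is used so that intervals such as 0.1 do not suffer from floating-point rounding.

```python
import math
from fractions import Fraction


def samples_needed(duration_seconds, interval_seconds):
    """Number of samples required to cover a duration at a given interval."""
    interval = Fraction(str(interval_seconds))
    if interval <= 0:
        raise ValueError("interval must be positive")
    return math.ceil(Fraction(str(duration_seconds)) / interval)


def timed_command(duration_seconds, interval_seconds):
    """Argument list for a run that exits after roughly duration_seconds."""
    count = samples_needed(duration_seconds, interval_seconds)
    return ["collectl", "-i", str(interval_seconds), "-oDm", "--iosize",
            "-c", str(count)]


if __name__ == "__main__":
    # One minute of quarter-second samples.
    print(" ".join(timed_command(60, 0.25)))
```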
The program can also send data to a remote host; see the man page: man collectl
[root@twexdb1 qzhijun]# collectl --help
This is a subset of the most common switches and even the descriptions are
abbreviated. To see all type 'collectl -x', to get started just type 'collectl'
usage: collectl [switches]
-c, --count count collect this number of samples and exit
-f, --filename file name of directory/file to write to
-i, --interval int collection interval in seconds [default=1]
-o, --options options misc formatting options, --showoptions for all
d|D - include date in output
T - include time in output
z - turn off compression of plot files
-p, --playback file playback results from 'file' (be sure to quote
if wild carded) or the shell might mess it up
-P, --plot generate output in 'plot' format
-s, --subsys subsys specify one or more subsystems [default=cdn]
--verbose display output in verbose format (automatically
selected when brief doesn't make sense)
Various types of help
-h, --help print this text
-v, --version print version
-V, --showdefs print operational defaults
-x, --helpextend extended help, more details descriptions too
-X, --helpall shows all help concatenated together
--showoptions show all the options
--showsubsys show all the subsystems
--showsubopts show all subsystem specific options
--showtopopts show --top options
--showheader show file header that 'would be' generated
--showcolheaders show column headers that 'would be' generated
--showslabaliases for SLUB allocator, show non-root aliases
--showrootslabs same as --showslabaliases but use 'root' names