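I. Use Case
Our production services use RabbitMQ as their message queue middleware, so monitoring RabbitMQ is an important job for operations staff. This article explains, from start to finish, how to monitor RabbitMQ with Zabbix.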
II. Key Points for Monitoring RabbitMQ
RabbitMQ officially provides two ways to manage and monitor a broker.
1. Managing and monitoring with rabbitmqctl
Usage:
rabbitmqctl [-n <node>] [-q] <command> [<command options>]
List virtual hosts
# rabbitmqctl list_vhosts
List queues
# rabbitmqctl list_queues
List exchanges
# rabbitmqctl list_exchanges
List users
# rabbitmqctl list_users
List connections
# rabbitmqctl list_connections
List consumers
# rabbitmqctl list_consumers
Show the broker's environment settings
# rabbitmqctl environment
Show the number of unacknowledged messages in each queue
# rabbitmqctl list_queues name messages_unacknowledged
Show the memory used by each queue
# rabbitmqctl list_queues name memory
Show the number of messages ready for delivery in each queue
# rabbitmqctl list_queues name messages_ready
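If the output of these commands needs to be consumed by a script rather than read by eye, it can be parsed with a few lines of code. The following is a minimal Python 3 sketch, not part of the original article: the function name parse_list_queues and the sample text are made up for illustration, and it assumes that rabbitmqctl prints one queue per line with tab-separated columns, possibly surrounded by banner lines that differ between versions.

```python
def parse_list_queues(output, columns):
    """Parse tab-separated `rabbitmqctl list_queues` output into dicts.

    `columns` must match the column names passed to rabbitmqctl,
    with the queue name first.
    """
    rows = []
    for line in output.splitlines():
        fields = line.split("\t")
        # Skip banner/footer lines and a possible header row.
        if len(fields) != len(columns) or fields == list(columns):
            continue
        row = dict(zip(columns, fields))
        try:
            # Every column after the name is expected to be an integer here.
            for col in columns[1:]:
                row[col] = int(row[col])
        except ValueError:
            continue
        rows.append(row)
    return rows


# Hypothetical sample output, for illustration only.
sample = "Listing queues ...\napp000\t0\norders\t12\n...done.\n"
backlog = parse_list_queues(sample, ("name", "messages_ready"))
print(backlog)
```

In real use the sample string would be replaced by the captured output of the command, for example via subprocess.check_output(["rabbitmqctl", "-q", "list_queues", "name", "messages_ready"]).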
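2. Monitoring and managing with the RabbitMQ Management plugin
Enable the Management plugin
# rabbitmq-plugins enable rabbitmq_management
Once the plugin is enabled, RabbitMQ's status can be viewed over HTTP on port 15672. The rabbitmqadmin management tool can be downloaded from a URL like this:
http://172.28.2.157:15672/cli/rabbitmqadmin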
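Get the list of vhosts
# curl -i -u guest:guest http://localhost:15672/api/vhosts
Get the list of channels, restricting the columns returned
# curl -i -u guest:guest "http://localhost:15672/api/channels?sort=message_stats.publish_details.rate&sort_reverse=true&columns=name,message_stats.publish_details.rate,message_stats.deliver_get_details.rate"
Show overview information
# curl -i -u guest:guest "http://localhost:15672/api/overview"
management_version Version of the management plugin
cluster_name Name of the whole RabbitMQ cluster, set with rabbitmqctl set_cluster_name
publish Total number of messages published
queue_totals Counts of ready messages, unacknowledged messages, uncommitted messages, and so on
statistics_db_event_queue Number of events not yet processed by the statistics database
consumers Number of consumers
queues Number of queues
exchanges Number of exchanges
connections Number of connections
channels Number of channels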
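The same overview call can be made from a script. Below is a minimal Python 3 sketch, offered as an illustration rather than as the article's own code (the article's script targets Python 2 and urllib2). The function names fetch_overview and summarize_overview and the sample dictionary are hypothetical; the grouping of counters under object_totals, queue_totals and message_stats follows what the monitoring script in section III reads.

```python
import base64
import json
import urllib.request


def fetch_overview(host="localhost", port=15672, user="guest", password="guest"):
    """GET /api/overview with HTTP basic auth and return the decoded JSON."""
    url = "http://{0}:{1}/api/overview".format(host, port)
    token = base64.b64encode("{0}:{1}".format(user, password).encode()).decode()
    request = urllib.request.Request(url, headers={"Authorization": "Basic " + token})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_overview(overview):
    """Pick out a few counters, defaulting to 0 when a key is absent."""
    objects = overview.get("object_totals", {})
    queue_totals = overview.get("queue_totals", {})
    stats = overview.get("message_stats", {})
    return {
        "connections": objects.get("connections", 0),
        "channels": objects.get("channels", 0),
        "queues": objects.get("queues", 0),
        "consumers": objects.get("consumers", 0),
        "messages_ready": queue_totals.get("messages_ready", 0),
        "messages_unacknowledged": queue_totals.get("messages_unacknowledged", 0),
        "publish": stats.get("publish", 0),
    }


# Hypothetical response fragment, for illustration only.
sample_overview = {
    "object_totals": {"connections": 3, "channels": 5, "queues": 2, "consumers": 4},
    "queue_totals": {"messages_ready": 5, "messages_unacknowledged": 2},
}
summary = summarize_overview(sample_overview)
print(summary)
```

Against a live broker, summarize_overview(fetch_overview()) would return the same shape. Defaulting to 0 matters because message_stats is missing from the response until messages have actually flowed.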
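Show node information
# curl -i -u guest:guest "http://localhost:15672/api/nodes"
disk_free Free disk space, in bytes
disk_free_limit Threshold at which the disk alarm is raised
fd_used Number of file descriptors in use
fd_total Number of file descriptors available
io_read_avg_time Average time of a read operation, in milliseconds
io_read_bytes Total amount of data read from disk, in bytes
io_read_count Total number of read operations
io_seek_avg_time Average time of a seek operation, in milliseconds
io_seek_count Total number of seek operations
io_sync_avg_time Average time of an fsync operation, in milliseconds
io_sync_count Total number of fsync operations
io_write_avg_time Average time of a disk write operation, in milliseconds
io_write_bytes Total amount of data written to disk, in bytes
io_write_count Total number of disk write operations
mem_used Memory in use, in bytes
mem_limit Threshold at which the memory alarm is raised; defaults to 40% of total physical memory
mnesia_disk_tx_count Number of Mnesia transactions that had to be written to disk
mnesia_ram_tx_count Number of Mnesia transactions that did not have to be written to disk
msg_store_write_count Number of messages written to the message store
msg_store_read_count Number of messages read from the message store
proc_used Number of Erlang processes in use
proc_total Maximum number of Erlang processes
queue_index_journal_write_count Number of records written to the queue index journal. Each record represents a message being published to a queue, delivered from a queue, or acknowledged in a queue
queue_index_read_count Number of records read from the queue index
queue_index_write_count Number of records written to the queue index
sockets_used Number of file descriptors used as sockets
partitions
uptime Time since the Erlang VM started, in milliseconds
run_queue Number of Erlang processes waiting to run
processors Number of cores detected and usable by Erlang
net_ticktime Current kernel net_ticktime setting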
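Several of these fields come in used/limit pairs, which makes them easier to alert on as ratios than as raw numbers. The following Python 3 sketch shows one way to derive such ratios from a single entry of the /api/nodes list. The function name node_usage, the output keys and the sample values are hypothetical; only the field names come from the list above.

```python
def node_usage(node):
    """Turn used/limit pairs from one /api/nodes entry into ratios."""

    def ratio(used, limit):
        # None signals that the limit is unknown or zero.
        return round(used / limit, 3) if limit else None

    return {
        "fd_ratio": ratio(node.get("fd_used", 0), node.get("fd_total", 0)),
        "mem_ratio": ratio(node.get("mem_used", 0), node.get("mem_limit", 0)),
        "proc_ratio": ratio(node.get("proc_used", 0), node.get("proc_total", 0)),
        # True when free disk space has dropped below the alarm threshold.
        "disk_below_limit": node.get("disk_free", 0) < node.get("disk_free_limit", 0),
    }


# Hypothetical node entry, for illustration only.
sample_node = {
    "fd_used": 256, "fd_total": 1024,
    "mem_used": 400, "mem_limit": 1000,
    "proc_used": 250, "proc_total": 1000,
    "disk_free": 10000, "disk_free_limit": 50000,
}
usage = node_usage(sample_node)
print(usage)
```

In Zabbix the same ratios could equally be built as calculated items from the raw values; computing them in a script is just one option.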
Show channel information
# curl -i -u guest:guest "http://localhost:15672/api/channels"
Show exchange information
# curl -i -u guest:guest "http://localhost:15672/api/exchanges"
Show queue information
# curl -i -u guest:guest "http://localhost:15672/api/queues"
Show vhost information
# curl -i -u guest:guest "http://localhost:15672/api/vhosts/?name=/"
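III. Writing the Monitoring Script and Adding the Zabbix Configuration File
The monitoring script covers three things: the overview, the node information for the current host, and the individual queues.
It is adapted from a script found online: many monitoring items were added, and the filter in the original script was removed.
A side note: code found online should not simply be used as-is. Analyze it against your own requirements first, which also improves your own coding skills. If you only ever copy code and run it unchanged, you will never get better.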
rabbitmq_status.py
#!/usr/bin/env /usr/bin/python
'''Python module to query the RabbitMQ Management Plugin REST API and get
results that can then be used by Zabbix.
https://github.com/jasonmcintosh/rabbitmq-zabbix
'''
'''
This script is tested on RabbitMQ 3.5.3
'''
import json
import optparse
import socket
import urllib2
import subprocess
import tempfile
import os
import logging

logging.basicConfig(filename='/opt/logs/zabbix/rabbitmq_zabbix.log',
                    level=logging.WARNING,
                    format='%(asctime)s %(levelname)s: %(message)s')


class RabbitMQAPI(object):
    '''Class for RabbitMQ Management API'''

    def __init__(self, user_name='guest', password='guest', host_name='',
                 protocol='http', port=15672,
                 conf='/opt/app/zabbix/conf/zabbix_agentd.conf',
                 senderhostname=None):
        self.user_name = user_name
        self.password = password
        self.host_name = host_name or socket.gethostname()
        self.protocol = protocol
        self.port = port
        self.conf = conf or '/opt/app/zabbix/conf/zabbix_agentd.conf'
        self.senderhostname = senderhostname if senderhostname else host_name

    def call_api(self, path):
        '''
        All URIs serve only resources of type application/json and require
        HTTP basic authentication. The default username and password are
        guest/guest. The default virtual host '/' is encoded as %2f.
        '''
        url = '{0}://{1}:{2}/api/{3}'.format(self.protocol, self.host_name, self.port, path)
        password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
        password_mgr.add_password(None, url, self.user_name, self.password)
        handler = urllib2.HTTPBasicAuthHandler(password_mgr)
        logging.debug('Issue a rabbit API call to get data on ' + path)
        # json.loads() converts JSON text to Python data;
        # json.dumps() converts Python data to JSON text.
        return json.loads(urllib2.build_opener(handler).open(url).read())

    def list_queues(self):
        '''
        curl -i -u guest:guest http://localhost:15672/api/queues
        returns a list
        '''
        queues = []
        for queue in self.call_api('queues'):
            logging.debug("Discovered queue " + queue['name'])
            element = {'{#VHOSTNAME}': queue['vhost'],
                       '{#QUEUENAME}': queue['name']}
            queues.append(element)
            logging.debug('Discovered queue ' + queue['vhost'] + '/' + queue['name'])
        return queues

    def list_nodes(self):
        '''Lists all rabbitMQ nodes in the cluster'''
        nodes = []
        for node in self.call_api('nodes'):
            # We need to return the node name, because Zabbix
            # does not support @ as an item parameter
            name = node['name'].split('@')[1]
            element = {'{#NODENAME}': name,
                       '{#NODETYPE}': node['type']}
            nodes.append(element)
            logging.debug('Discovered nodes ' + name + '/' + node['type'])
        return nodes

    def check_queue(self):
        '''Collect the data for every queue and send it to Zabbix.'''
        return_code = 0
        # Create a named temporary file on disk. Because delete=False,
        # it is not removed when closed, so zabbix_sender can read it.
        rdatafile = tempfile.NamedTemporaryFile(delete=False)
        for queue in self.call_api('queues'):
            self._get_queue_data(queue, rdatafile)
        rdatafile.close()
        return_code = self._send_queue_data(rdatafile)
        # os.unlink removes the temporary file
        os.unlink(rdatafile.name)
        return return_code

    def _get_queue_data(self, queue, tmpfile):
        '''Prepare the queue data for sending'''
        '''
        ### one single queue's information looks like this #####
        ### curl -i -u guest:guest http://localhost:15672/api/queues dumps a list ###
        {"memory":32064,"message_stats":{"ack":3870,"ack_details":{"rate":0.0},"deliver":3871,"deliver_details":{"rate":0.0},"deliver_get":3871,"deliver_get_details":{"rate":0.0},"disk_writes":3870,"disk_writes_details":{"rate":0.0},"publish":3870,"publish_details":{"rate":0.0},"redeliver":1,"redeliver_details":{"rate":0.0}},"messages":0,"messages_details":{"rate":0.0},"messages_ready":0,"messages_ready_details":{"rate":0.0},"messages_unacknowledged":0,"messages_unacknowledged_details":{"rate":0.0},"idle_since":"2016-03-01 22:04:22","consumer_utilisation":"","policy":"","exclusive_consumer_tag":"","consumers":4,"recoverable_slaves":"","state":"running","messages_ram":0,"messages_ready_ram":0,"messages_unacknowledged_ram":0,"messages_persistent":0,"message_bytes":0,"message_bytes_ready":0,"message_bytes_unacknowledged":0,"message_bytes_ram":0,"message_bytes_persistent":0,"disk_reads":0,"disk_writes":3870,"backing_queue_status":{"q1":0,"q2":0,"delta":["delta",0,0,0],"q3":0,"q4":0,"len":0,"target_ram_count":"infinity","next_seq_id":3870,"avg_ingress_rate":0.060962064328682466,"avg_egress_rate":0.060962064328682466,"avg_ack_ingress_rate":0.060962064328682466,"avg_ack_egress_rate":0.060962064328682466},"name":"app000","vhost":"/","durable":true,"auto_delete":false,"arguments":{},"node":"rabbit@test2"}
        '''
        for item in ['memory', 'messages', 'messages_ready',
                     'messages_unacknowledged', 'consumers']:
            # key = rabbitmq.queues[/,queue_memory,queue.helloWorld]
            key = '"rabbitmq.queues[{0},queue_{1},{2}]"'.format(queue['vhost'], item, queue['name'])
            # if item is in queue, value = queue[item], else value = 0
            value = queue.get(item, 0)
            logging.debug("SENDER_DATA: - %s %s" % (key, value))
            tmpfile.write("- %s %s\n" % (key, value))
        # This is a non standard bit of information added after the standard items
        for item in ['deliver_get', 'publish']:
            key = '"rabbitmq.queues[{0},queue_message_stats_{1},{2}]"'.format(queue['vhost'], item, queue['name'])
            value = queue.get('message_stats', {}).get(item, 0)
            logging.debug("SENDER_DATA: - %s %s" % (key, value))
            tmpfile.write("- %s %s\n" % (key, value))

    def _send_queue_data(self, tmpfile):
        '''Send the queue data to Zabbix.'''
        '''Get key value from temp file.'''
        args = '/opt/app/zabbix/sbin/zabbix_sender -c {0} -i {1}'
        if self.senderhostname:
            args = args + " -s " + self.senderhostname
        return_code = 0
        process = subprocess.Popen(args.format(self.conf, tmpfile.name),
                                   shell=True, stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE)
        out, err = process.communicate()
        logging.debug("Finished sending data")
        return_code = process.wait()
        logging.info("Found return code of " + str(return_code))
        if return_code != 0:
            logging.warning(out)
            logging.warning(err)
        else:
            logging.debug(err)
            logging.debug(out)
        return return_code

    def check_aliveness(self):
        '''Check the aliveness status of a given vhost.'''
        '''virtual host '/' should be encoded as '%2f' '''
        return self.call_api('aliveness-test/%2f')['status']

    def check_overview(self, item):
        '''First, check the overview specific items'''
        ''' curl -i -u guest:guest http://localhost:15672/api/overview '''
        # rabbitmq[overview,connections]
        if item in ['channels', 'connections', 'consumers', 'exchanges', 'queues']:
            return self.call_api('overview').get('object_totals').get(item, 0)
        # rabbitmq[overview,messages]
        elif item in ['messages', 'messages_ready', 'messages_unacknowledged']:
            return self.call_api('overview').get('queue_totals').get(item, 0)
        elif item == 'message_stats_deliver_get':
            return self.call_api('overview').get('message_stats', {}).get('deliver_get', 0)
        elif item == 'message_stats_publish':
            return self.call_api('overview').get('message_stats', {}).get('publish', 0)
        elif item == 'message_stats_ack':
            return self.call_api('overview').get('message_stats', {}).get('ack', 0)
        elif item == 'message_stats_redeliver':
            return self.call_api('overview').get('message_stats', {}).get('redeliver', 0)
        elif item == 'rabbitmq_version':
            return self.call_api('overview').get('rabbitmq_version', 'None')

    def check_server(self, item, node_name):
        '''Return the value for a specific item in a node's details.'''
        '''curl -i -u guest:guest http://localhost:15672/api/nodes'''
        '''return a list'''
        # hostname                          hk-prod-mq1.example.com
        # self.call_api('nodes')[0]['name'] rabbit@hk-prod-mq1
        node_name = node_name.split('.')[0]
        for nodeData in self.call_api('nodes'):
            if node_name in nodeData['name']:
                return nodeData.get(item, 0)
        return 'Not Found'


def main():
    '''Command-line parameters and decoding for Zabbix use/consumption.'''
    choices = ['list_queues', 'list_nodes', 'queues', 'check_aliveness',
               'overview', 'server']
    parser = optparse.OptionParser()
    parser.add_option('--username', help='RabbitMQ API username', default='guest')
    parser.add_option('--password', help='RabbitMQ API password', default='guest')
    parser.add_option('--hostname', help='RabbitMQ API host', default=socket.gethostname())
    parser.add_option('--protocol', help='RabbitMQ API protocol (http or https)', default='http')
    parser.add_option('--port', help='RabbitMQ API port', type='int', default=15672)
    parser.add_option('--check', type='choice', choices=choices, help='Type of check')
    parser.add_option('--metric', help='Which metric to evaluate', default='')
    parser.add_option('--node', help='Which node to check (valid for --check=server)')
    parser.add_option('--conf', default='/opt/app/zabbix/conf/zabbix_agentd.conf')
    parser.add_option('--senderhostname', default='',
                      help='Allows including a sender parameter on calls to zabbix_sender')
    (options, args) = parser.parse_args()
    if not options.check:
        parser.error('At least one check should be specified')
    logging.debug("Started trying to process data")
    api = RabbitMQAPI(user_name=options.username, password=options.password,
                      host_name=options.hostname, protocol=options.protocol,
                      port=options.port, conf=options.conf,
                      senderhostname=options.senderhostname)
    if options.check == 'list_queues':
        print json.dumps({'data': api.list_queues()}, indent=4, separators=(',', ':'))
    elif options.check == 'list_nodes':
        print json.dumps({'data': api.list_nodes()}, indent=4, separators=(',', ':'))
    elif options.check == 'queues':
        print api.check_queue()
    elif options.check == 'check_aliveness':
        print api.check_aliveness()
    elif options.check == 'overview':
        # rabbitmq[overview,connections]
        # --check=overview --metric=connections
        if not options.metric:
            parser.error('Missing required parameter: "metric"')
        else:
            # The overview is cluster-wide, so --node makes no difference here.
            print api.check_overview(options.metric)
    elif options.check == 'server':
        # rabbitmq[server,sockets_used]
        # --check=server --metric=sockets_used
        if not options.metric:
            parser.error('Missing required parameter: "metric"')
        else:
            if options.node:
                print api.check_server(options.metric, options.node)
            else:
                print api.check_server(options.metric, api.host_name)


if __name__ == '__main__':
    main()
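The check_aliveness method in the script hard-codes the default virtual host as %2f, so it cannot test any other vhost. A minimal Python 3 sketch of building that path for an arbitrary vhost is shown below. The helper name aliveness_path and the vhost "my vhost" are hypothetical and not part of the original script. Note that urllib.parse.quote emits uppercase hex digits (%2F); percent-encoded octets are case-insensitive, so this is equivalent to the %2f used in the script.

```python
from urllib.parse import quote


def aliveness_path(vhost="/"):
    """Build the API path for the aliveness test of one vhost.

    safe="" forces "/" itself to be percent-encoded, which is what
    the management API expects for the default vhost.
    """
    return "aliveness-test/" + quote(vhost, safe="")


default_path = aliveness_path()
spaced_path = aliveness_path("my vhost")
print(default_path)
print(spaced_path)
```

The same quoting would apply to any other API path that embeds a vhost name.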
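How the script works:
It uses the urllib2 module to call the RabbitMQ API.
It processes the data returned by the API.
The overview and nodes data are fetched through zabbix_agent, while the queues data is pushed to Zabbix by zabbix_sender. The push does not start by itself: it has to be triggered actively by a zabbix_agent key (the rabbitmq.queues key in rabbitmq_status.conf).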
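The push step is easier to follow once the input file given to zabbix_sender is visible. The following Python 3 sketch rebuilds the lines that the script's _get_queue_data method writes for one queue: each line is a host, a key and a value, and a "-" in the host position tells zabbix_sender to use the hostname from its configuration file or from -s. The function name sender_lines and the sample queue are hypothetical; the key layout is copied from the script.

```python
def sender_lines(queue, items=("memory", "messages", "messages_ready",
                               "messages_unacknowledged", "consumers")):
    """Format one queue's counters as zabbix_sender input lines."""
    lines = []
    for item in items:
        # Same key layout as the script: rabbitmq.queues[vhost,queue_<item>,name]
        key = '"rabbitmq.queues[{0},queue_{1},{2}]"'.format(
            queue["vhost"], item, queue["name"])
        # Missing counters are reported as 0, as in the script.
        lines.append("- {0} {1}".format(key, queue.get(item, 0)))
    return lines


# Hypothetical queue entry, for illustration only.
sample_queue = {"vhost": "/", "name": "app000", "memory": 32064, "consumers": 4}
lines = sender_lines(sample_queue)
print("\n".join(lines))
```

Each key must match a trapper item on the Zabbix side, otherwise zabbix_sender reports the value as failed.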
rabbitmq_status.conf
UserParameter=rabbitmq.discovery_queue,/usr/bin/python /opt/app/zabbix/sbin/rabbitmq_status.py --check=list_queues
UserParameter=rabbitmq.queues,/usr/bin/python /opt/app/zabbix/sbin/rabbitmq_status.py --check=queues
UserParameter=rabbitmq[*],/usr/bin/python /opt/app/zabbix/sbin/rabbitmq_status.py --check=$1 --metric=$2
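The rabbitmq.discovery_queue key returns the JSON document that Zabbix low-level discovery uses to create one set of items per queue. The Python 3 sketch below reproduces the shape of that document with the same macros as the script's list_queues method. The function name discovery_payload and the sample queue list are hypothetical.

```python
import json


def discovery_payload(queues):
    """Build the low-level discovery JSON for a list of queue dicts."""
    data = [{"{#VHOSTNAME}": q["vhost"], "{#QUEUENAME}": q["name"]} for q in queues]
    # Same wrapper and formatting options as the script's --check=list_queues.
    return json.dumps({"data": data}, indent=4, separators=(",", ":"))


# Hypothetical queue list, for illustration only.
payload = discovery_payload([{"vhost": "/", "name": "app000"}])
print(payload)
```

With the rabbitmq[*] key, the item parameters map onto the script's options, so rabbitmq[overview,connections] runs the script with --check=overview --metric=connections.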
IV. Adding the Zabbix Monitoring Template
See the attachment for the template.
References:
http://blog.thomasvandoren.com/monitoring-rabbitmq-queues-with-zabbix.html
http://www.rabbitmq.com/how.html#management
https://github.com/alfss/zabbix-rabbitmq
https://cdn.rawgit.com/rabbitmq/rabbitmq-management/rabbitmq_v3_6_0/priv/www/api/index.html
https://github.com/jasonmcintosh/rabbitmq-zabbix
http://chase-seibert.github.io/blog/2011/07/01/checking-rabbitmq-queue-sizeage-with-nagios.html