聊聊redis的slowlog與latency monitor

本文主要研究一下redis的slowlog與latency monitorhtml

slowlog

redis在2.2.12版本引入了slowlog,用於記錄超過指定執行時間的命令,這個執行時間不包括諸如與客戶端通訊的IO操做耗時,是實實在在的命令執行的耗時。主要有以下操做:redis

查看slowlog的數量

127.0.0.1:6379> slowlog len
(integer) 1024

查看slowlog的執行耗時閾值

127.0.0.1:6379> config get slowlog-log-slower-than
1) "slowlog-log-slower-than"
2) "1000"

設置slowlog的執行耗時閾值

127.0.0.1:6379> config set slowlog-log-slower-than 1000
OK

查看slowlog保存數量的閾值

127.0.0.1:6379> config get slowlog-max-len
1) "slowlog-max-len"
2) "1024"

設置slowlog保存數量的閾值

127.0.0.1:6379> config set slowlog-max-len 1024
OK

查詢slowlog

127.0.0.1:6379> slowlog get 1
1) 1) (integer) 76016
   2) (integer) 1537250266
   3) (integer) 48296
   4) 1) "COMMAND"

第一行是命令id,第二行是timestamp,第三行是執行耗時,第四行是命令及參數this

清除slowlog記錄

127.0.0.1:6379> slowlog reset
OK

latency monitor

redis在2.8.13版本引入了latency monitoring,這裏主要是監控latency spikes(延時毛刺)。它基於事件機制進行監控,command事件是監控命令執行latency,fast-command事件是監控時間複雜度爲O(1)及O(logN)命令的latency,fork事件則監控redis執行系統調用fork(2)的latency。主要有以下操做:debug

設置/開啓latency monitor

127.0.0.1:6379> config set latency-monitor-threshold 100
OK

讀取latency monitor配置

127.0.0.1:6379> config get latency-monitor-threshold
1) "latency-monitor-threshold"
2) "100"

獲取最近的latency

127.0.0.1:6379> debug sleep 1
OK
(1.01s)
127.0.0.1:6379> debug sleep .25
OK
127.0.0.1:6379> latency latest
1) 1) "command"
   2) (integer) 1537268070
   3) (integer) 250
   4) (integer) 1010

返回事件名、發生的時間戳、最近的延時(毫秒)、最大的延時(毫秒)code

查看某一事件的延時歷史

127.0.0.1:6379> latency history command
1) 1) (integer) 1537268064
   2) (integer) 1010
2) 1) (integer) 1537268070
   2) (integer) 250

查看事件延時圖

127.0.0.1:6379> latency reset command
(integer) 0
127.0.0.1:6379> debug sleep .1
OK
127.0.0.1:6379> debug sleep .2
OK
127.0.0.1:6379> debug sleep .3
OK
127.0.0.1:6379> debug sleep .4
OK
127.0.0.1:6379> debug sleep .5
OK
(0.50s)
127.0.0.1:6379> latency graph
(error) ERR syntax error
127.0.0.1:6379> latency graph command
command - high 500 ms, low 100 ms (all time high 500 ms)
--------------------------------------------------------------------------------
   _#
  _||
 _|||
_||||

22117
4062s
ssss

重置/清空事件數據

127.0.0.1:6379> latency reset command
(integer) 1
127.0.0.1:6379> latency history command
(empty list or set)
127.0.0.1:6379> latency latest
(empty list or set)

診斷建議

127.0.0.1:6379> latency doctor
Dave, I have observed latency spikes in this Redis instance. You don't mind talking about it, do you Dave?

1. command: 6 latency spikes (average 257ms, mean deviation 142ms, period 3.83 sec). Worst all time event 500ms.

I have a few advices for you:

- Check your Slow Log to understand what are the commands you are running which are too slow to execute. Please check http://redis.io/commands/slowlog for more information.
- Deleting, expiring or evicting (because of maxmemory policy) large objects is a blocking operation. If you have very large objects that are often deleted, expired, or evicted, try to fragment those objects into multiple smaller objects.

小結

  • redis的slowlog在2.2.12版本引入,latency monitor在2.8.13版本引入
  • slowlog僅僅是記錄純命令的執行耗時,不包括與客戶端的IO交互及redis的fork等耗時
  • latency monitor監控的latency spikes則範圍廣一點,不只包括命令執行,也包括fork(2)系統調用,key過時等操做的耗時

doc

相關文章
相關標籤/搜索