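Mining more value out of logs has always been a focus for our team. Building on the existing real-time log query, this year SLS has rounded out the following capabilities in the DevOps area:
Today we focus on how intelligent log clustering and anomaly alerting work together to detect anomalies and raise alerts more effectively.
We start from a set of raw Sys Log data with the log clustering service enabled. The screenshot below shows its state:
Adjusting the size of red box 1 in the screenshot below changes the result in red box 2, but the finest-grained patterns do not change. In other words, the result for a sub-pattern is stable and unique, so we can use the sub-pattern's signature to find the corresponding raw log entries.
Suppose we want to monitor this sub-pattern: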
msg:vm-111932.tc su: pam_unix(*:session): session closed for user root
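The corresponding signature_id: __log_signature__: 1814836459146662485
This gives us the raw logs that match the pattern above. We can look at a histogram of their count along the time axis:
In the figure above, logs of this pattern are not evenly distributed, and some intervals contain none at all. If we simply count them per time window, we get the following time series: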
__log_signature__: 1814836459146662485 | select date_trunc('minute', __time__) as time, COUNT(*) as num from log GROUP BY time order by time ASC limit 10000
The figure above shows that the series is not continuous in time, so we need to fill in the missing points.
__log_signature__: 1814836459146662485 | select time_series(time, '1m', '%Y-%m-%d %H:%i:%s', '0') as time, avg(num) as num from ( select __time__ - __time__ % 60 as time, COUNT(*) as num from log GROUP BY time order by time desc ) GROUP by time order by time ASC limit 10000
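To make the gap-filling step concrete, here is a minimal Python sketch of the same idea outside SLS. It buckets timestamps per minute the way `__time__ - __time__ % 60` does, then fills every missing minute with 0. The function names and sample timestamps are hypothetical, and the sketch only fills between the first and last observed bucket; how `time_series` treats the edges of the query window is not covered here.

```python
from collections import Counter


def count_per_minute(timestamps):
    """Count events per minute bucket, mirroring `__time__ - __time__ % 60`."""
    return Counter(ts - ts % 60 for ts in timestamps)


def fill_missing_minutes(counts, fill_value=0):
    """Return (minute_start, count) pairs for every minute between the
    first and last observed bucket, using fill_value where no log exists."""
    if not counts:
        return []
    start, end = min(counts), max(counts)
    return [(t, counts.get(t, fill_value)) for t in range(start, end + 60, 60)]


# Hypothetical Unix timestamps (seconds): two events in the first minute,
# none in the next two minutes, one in the fourth.
sample = [1546300805, 1546300830, 1546300990]
series = fill_missing_minutes(count_per_minute(sample))

for minute_start, num in series:
    print(minute_start, num)
```

Without the fill step the two empty minutes would simply be absent from the series, which is the discontinuity seen in the chart.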
Next, apply the time-series anomaly detection function ts_predicate_arma:
__log_signature__: 1814836459146662485 | select ts_predicate_arma(to_unixtime(time), num, 5, 1, 1, 1, 'avg') from ( select time_series(time, '1m', '%Y-%m-%d %H:%i:%s', '0') as time, avg(num) as num from ( select __time__ - __time__ % 60 as time, COUNT(*) as num from log GROUP BY time order by time desc ) GROUP by time order by time ASC ) limit 10000
__log_signature__: 1814836459146662485 | select t1[1] as unixtime, t1[2] as src, t1[3] as pred, t1[4] as up, t1[5] as lower, t1[6] as prob from ( select ts_predicate_arma(to_unixtime(time), num, 5, 1, 1, 1, 'avg') as res from ( select time_series(time, '1m', '%Y-%m-%d %H:%i:%s', '0') as time, avg(num) as num from ( select __time__ - __time__ % 60 as time, COUNT(*) as num from log GROUP BY time order by time desc ) GROUP by time order by time ASC )) , unnest(res) as t(t1)
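The query above expands the function's result into one row per point and names the six positions via aliases. As a rough illustration, the same unpacking might look like this in Python. The field names are taken from the aliases in the query; the `Point` type, the `unpack` helper, and the sample rows are hypothetical, not real output.

```python
from typing import List, NamedTuple, Sequence


class Point(NamedTuple):
    # Field order follows the aliases t1[1]..t1[6] in the query.
    unixtime: float
    src: float
    pred: float
    up: float
    lower: float
    prob: float


def unpack(res: Sequence[Sequence[float]]) -> List[Point]:
    """Turn each 6-element row of the result into a named record."""
    return [Point(*row[:6]) for row in res]


# Hypothetical rows, only to show the shape of the data.
res = [
    [1546300800, 2.0, 1.5, 3.0, 0.0, 0.0],
    [1546300860, 9.0, 1.6, 3.5, 0.0, 1.0],
]
points = unpack(res)
print(points[1].src, points[1].up, points[1].prob)
```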
__log_signature__: 1814836459146662485 | select unixtime, src, pred, up, lower, prob from ( select t1[1] as unixtime, t1[2] as src, t1[3] as pred, t1[4] as up, t1[5] as lower, t1[6] as prob from ( select ts_predicate_arma(to_unixtime(time), num, 5, 1, 1, 1, 'avg') as res from ( select time_series(time, '1m', '%Y-%m-%d %H:%i:%s', '0') as time, avg(num) as num from ( select __time__ - __time__ % 60 as time, COUNT(*) as num from log GROUP BY time order by time desc ) GROUP by time order by time ASC )) , unnest(res) as t(t1) ) where is_nan(src) = false order by unixtime desc limit 2
__log_signature__: 1814836459146662485 | select sum(prob) as sumProb, max(src) as srcMax, max(up) as upMax from ( select unixtime, src, pred, up, lower, prob from ( select t1[1] as unixtime, t1[2] as src, t1[3] as pred, t1[4] as up, t1[5] as lower, t1[6] as prob from ( select ts_predicate_arma(to_unixtime(time), num, 5, 1, 1, 1, 'avg') as res from ( select time_series(time, '1m', '%Y-%m-%d %H:%i:%s', '0') as time, avg(num) as num from ( select __time__ - __time__ % 60 as time, COUNT(*) as num from log GROUP BY time order by time desc ) GROUP by time order by time ASC )) , unnest(res) as t(t1) ) where is_nan(src) = false order by unixtime desc limit 2 )
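The last two queries keep only rows where src is not NaN, take the two most recent points, and collapse them into sumProb, srcMax and upMax. A minimal Python sketch of that logic follows, assuming the points are already available as dictionaries. The `summarize_latest` helper and the sample data are hypothetical, and the actual trigger condition is part of the alert settings and is not reproduced here.

```python
import math


def summarize_latest(points, n=2):
    """Drop points whose src is NaN, keep the n most recent ones,
    and aggregate them the way the final query does."""
    valid = [p for p in points if not math.isnan(p["src"])]
    latest = sorted(valid, key=lambda p: p["unixtime"], reverse=True)[:n]
    if not latest:
        return None
    return {
        "sumProb": sum(p["prob"] for p in latest),
        "srcMax": max(p["src"] for p in latest),
        "upMax": max(p["up"] for p in latest),
    }


# Hypothetical detection output; the last row has no observed value.
points = [
    {"unixtime": 1546300800, "src": 2.0, "up": 3.0, "prob": 0.0},
    {"unixtime": 1546300860, "src": 0.0, "up": 3.0, "prob": 0.0},
    {"unixtime": 1546300920, "src": 9.0, "up": 3.5, "prob": 1.0},
    {"unixtime": 1546300980, "src": float("nan"), "up": 3.5, "prob": 0.0},
]
summary = summarize_latest(points)
print(summary)
```

Collapsing the latest points into a single row gives an alert rule one set of values to evaluate on each run.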
The alert settings are as follows:
Demos of the various Log Service features: Log Service overview and demos
This article is original content from the Yunqi Community and may not be reproduced without permission.