explainit-cisco-2019-sigmod

An application of causality theory, together with an algorithm for verifying that a score did not arise by chance.

RCA tools can usually query and classify anomalies and run correlation analysis (causal probabilistic graphical models).

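  • Spurious correlations: these appear when the dimensionality exceeds the number of data points.
  • Interactive queries: the user specifies the target metrics of interest (Y), the normal and anomalous time ranges, optionally the metrics to control for (Z), and optionally the search space of metrics (X). The result is the top 20 root-cause candidates in the search space: scores(Xi) <- assoc(Y, Xi | Z).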
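
The query form above can be sketched as a small ranking loop. This is only an illustration, not ExplainIt!'s actual interface: the function names `abs_pearson` and `top_k_candidates`, the dictionary of candidate metrics, and the use of absolute Pearson correlation as the `assoc` function are all assumptions made for the example.

```python
import numpy as np

def abs_pearson(y, x, z=None):
    """Placeholder association score: |Pearson correlation|. Ignores z (assumption)."""
    return abs(np.corrcoef(y, x)[0, 1])

def top_k_candidates(y, candidates, z=None, assoc=abs_pearson, k=20):
    """Score every candidate Xi with assoc(Y, Xi | Z) and return the k best."""
    scores = {name: assoc(y, x, z) for name, x in candidates.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:k]

# Synthetic data: one metric really drives y, the others are unrelated noise.
rng = np.random.default_rng(0)
cause = rng.normal(size=200)
y = 2.0 * cause + 0.1 * rng.normal(size=200)
candidates = {
    "cause": cause,                    # hypothetical metric names
    "noise_a": rng.normal(size=200),
    "noise_b": rng.normal(size=200),
}
ranking = top_k_candidates(y, candidates, k=2)
print(ranking)
```

Keeping `assoc` pluggable mirrors the declarative idea in the notes: the query states what to score, and the scoring method is chosen separately.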

Principle

Causal Bayesian network. Complex relationships can be built up from conditional relationships between pairs of variables.

- ExplainIt! – A Declarative Root-cause Analysis Engine for Time Series Data
 - Why? The above approach offers three main benefits. 
 - First, the formalism is a non-parametric and declarative way of expressing dependencies between variables and defers any specific approach to the runtime system. 
 - Second, the unified approach naturally lends itself to multivariate dependencies of more complex relationships beyond simple correlations between pairwise univariate metrics. 
 - Third, the approach also gives us a way to reason about dependencies that might be easier to detect only when holding some variables constant.

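1. Feature family: metrics can be aggregated, for example by host, much like a group by. One feature family might be the 75th-percentile latency, another the current number of cluster jobs.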

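2. Ranking: given a hypothesis (X, Y, Z), produce a ranking of the Xi.
Univariate, Z empty: for every Xi in X and every Yj in Y, compute the Pearson product-moment correlation $\rho_{i,j}$ and score by the mean or the maximum of its absolute value, e.g. $\mathrm{CorrMean} = \operatorname{mean}_{i,j} |\rho_{i,j}|$.
Multivariate, Z empty: linear regression (with random projection for dimensionality reduction) plus a loss function, from which $R^2$ is computed.
Z non-empty: regress Y ~ Z and X ~ Z to obtain the residuals $R_{Y;Z}$ and $R_{X;Z}$, then regress one residual on the other to compute $R^2(Y; X \mid Z)$.
When X has many predictors and few observations, a Ridge penalty achieves the same effect as the adjusted $R^2$ (discussed further below).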
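
The scoring variants above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the helper names (`corr_scores`, `residuals`, `r2`, `conditional_r2`), the closed-form ridge solve, and the penalty value `alpha=1e-3` are all choices made for the example, and the random-projection step is left out.

```python
import numpy as np

def corr_scores(X, Y):
    """CorrMean and CorrMax over all column pairs (Xi, Yj). X: (n, p), Y: (n, q)."""
    p = X.shape[1]
    # With rowvar=False each column is a variable; the top-right p x q block
    # of the correlation matrix holds the X-versus-Y correlations.
    rho = np.corrcoef(X, Y, rowvar=False)[:p, p:]
    return np.abs(rho).mean(), np.abs(rho).max()

def residuals(A, Z, alpha=1e-3):
    """Residuals of the ridge regression A ~ Z (intercept handled by centering)."""
    Zc = Z - Z.mean(axis=0)
    Ac = A - A.mean(axis=0)
    coef = np.linalg.solve(Zc.T @ Zc + alpha * np.eye(Zc.shape[1]), Zc.T @ Ac)
    return Ac - Zc @ coef

def r2(Y, X, alpha=1e-3):
    """Fraction of the variance of Y explained by a ridge regression Y ~ X."""
    res = residuals(Y, X, alpha)
    Yc = Y - Y.mean(axis=0)
    return 1.0 - (res ** 2).sum() / (Yc ** 2).sum()

def conditional_r2(Y, X, Z, alpha=1e-3):
    """R^2(Y; X | Z): regress both on Z, then regress residual on residual."""
    return r2(residuals(Y, Z, alpha), residuals(X, Z, alpha), alpha)

# Synthetic check: z confounds x_conf and y, while x_cause drives y directly.
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=(n, 1))
x_conf = z + 0.1 * rng.normal(size=(n, 1))
x_cause = rng.normal(size=(n, 1))
y = z + x_cause + 0.1 * rng.normal(size=(n, 1))

corr_mean, corr_max = corr_scores(np.hstack([x_conf, x_cause]), y)
print("unconditional R2, x_conf :", r2(y, x_conf))
print("conditional R2,  x_conf :", conditional_r2(y, x_conf, z))
print("conditional R2,  x_cause:", conditional_r2(y, x_cause, z))
```

In this synthetic setup, `x_conf` looks related to `y` only through `z`, so its score should collapse once `z` is controlled for, while the score of `x_cause` should stay high.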

Open question: can experiments be used to complete the graph?

Evaluation

Evaluating the scoring methods:
Ranking accuracy: if the true cause is ranked r-th, the score is 1/r.
Success rate: 1 if the true cause is in the top k, otherwise 0.
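
Both measures are simple to compute from a ranked list. In the sketch below, the function names and the metric names are hypothetical, and returning 0.0 when the cause is missing from the ranking is an assumption the notes do not cover.

```python
def reciprocal_rank(ranking, cause):
    """Return 1/r when the true cause sits at 1-based position r."""
    for r, name in enumerate(ranking, start=1):
        if name == cause:
            return 1.0 / r
    return 0.0  # assumption: a cause absent from the ranking scores zero

def success_at_k(ranking, cause, k):
    """Return 1 if the true cause is among the top k entries, otherwise 0."""
    return 1 if cause in ranking[:k] else 0

# Hypothetical ranking produced by some scoring method.
ranking = ["disk_io", "tcp_retransmits", "cpu_load"]
print(reciprocal_rank(ranking, "tcp_retransmits"))  # 0.5
print(success_at_k(ranking, "tcp_retransmits", 1))  # 0
print(success_at_k(ranking, "tcp_retransmits", 2))  # 1
```

Averaging these values over many incidents would give a per-method ranking accuracy and success rate.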

Theory

The PC/SGS algorithms use pairwise conditional independence tests to recover the full causal structure. ExplainIt! uses the same kind of tests but also considers joint sets of variables, and root-cause analysis rarely requires the full causal structure.

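The paper addresses overfitting by using the adjusted $R^2$ ($r_{adj}$). It relates the probability that a score of at least s arises purely by chance to n and p. A score s below the resulting threshold should not be trusted.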
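
To see why the adjustment matters, the sketch below fits ordinary least squares on pure noise with p close to n. It uses the standard adjusted $R^2$ formula, 1 - (1 - R²)(n - 1)/(n - p - 1). The function names and the sizes n = 50, p = 40 are assumptions made for the example, and it does not reproduce the paper's probability bound.

```python
import numpy as np

def adjusted_r2(r2, n, p):
    """Standard adjusted R^2 for n observations and p predictors."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

def ols_r2(X, y):
    """Plain R^2 of an ordinary least-squares fit with an intercept."""
    Xd = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(Xd, y, rcond=None)
    res = y - Xd @ coef
    centered = y - y.mean()
    return 1.0 - (res @ res) / (centered @ centered)

# Pure noise: X carries no information about y at all.
rng = np.random.default_rng(0)
n, p = 50, 40
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

plain = ols_r2(X, y)
adjusted = adjusted_r2(plain, n, p)
print("plain R2   :", plain)
print("adjusted R2:", adjusted)
```

With many predictors and few observations the plain score tends to be high even though nothing real is being explained. The adjusted score is always lower, which is why an unadjusted score below the chance level for a given n and p carries little evidence.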
