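Algorithm description:
Input: training data $T=\{(x_{1},y_{1}),(x_{2},y_{2}),...,(x_{N},y_{N})\}$, where $x_{i}=(x_{i}^{(1)},x_{i}^{(2)},...,x_{i}^{(n)})$, $x_{i}^{(j)}$ is the $j$-th feature of the $i$-th sample, $x_{i}^{(j)}\in \{ a_{j1},a_{j2},...,a_{jS_{j}} \}$, $a_{jl}$ is the $l$-th value that the $j$-th feature can take, $j=1,2,...,n$, $l=1,2,...,S_{j}$, and $y_{i} \in \{ c_{1},c_{2},...,c_{K} \}$; an instance $x$
Output: the class of instance $x$
(1) Compute the prior probabilities and the conditional probabilities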
$P(Y=c_{k})=\frac{\sum_{i=1}^{N}I(y_{i}=c_{k})}{N},k=1,2,...,K$
$P(X^{(j)}=a_{jl}|Y=c_{k})=\frac{\sum_{i=1}^{N}I(x_{i}^{(j)}=a_{jl},y_{i}=c_{k})}{\sum_{i=1}^{N}I(y_{i}=c_{k})},$
$j=1,2,...,n;l=1,2,...,S_{j};k=1,2,...,K$
(2) For the given instance $x=(x^{(1)},x^{(2)},...,x^{(n)})^{T}$, compute
$P(Y=c_{k})\prod_{j=1}^{n} P(X^{(j)}=x^{(j)}|Y=c_{k}),k=1,2,...,K$
(3) Determine the class of instance $x$
$y=\arg\max_{c_{k}}P(Y=c_{k})\prod_{j=1}^{n}P(X^{(j)}=x^{(j)}|Y=c_{k})$
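The three steps above can be sketched in plain Python. This is a minimal illustration, not a reference implementation: the function names `fit` and `predict` and the toy data are made up for this example, and the estimates are the unsmoothed relative frequencies from step (1), so a feature value never seen together with a class gets probability 0.

```python
from collections import Counter, defaultdict


def fit(X, y):
    """Step (1): estimate P(Y=c_k) and P(X^(j)=a_jl | Y=c_k) by counting."""
    n_samples = len(y)
    class_counts = Counter(y)
    prior = {c: count / n_samples for c, count in class_counts.items()}

    # joint_counts[(j, a, c)] = number of samples whose j-th feature is a and label is c
    joint_counts = defaultdict(int)
    for xi, yi in zip(X, y):
        for j, a in enumerate(xi):
            joint_counts[(j, a, yi)] += 1

    cond = {key: count / class_counts[key[2]] for key, count in joint_counts.items()}
    return prior, cond


def predict(x, prior, cond):
    """Steps (2) and (3): score every class, then return the arg max."""
    scores = {}
    for c, p in prior.items():
        score = p
        for j, a in enumerate(x):
            # Unseen (feature value, class) pairs contribute probability 0 (no smoothing).
            score *= cond.get((j, a, c), 0.0)
        scores[c] = score
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two categorical features, labels in {-1, 1}.
X = [(1, "S"), (1, "M"), (1, "M"), (2, "S"), (2, "M"), (2, "L")]
y = [-1, -1, 1, -1, 1, 1]

prior, cond = fit(X, y)
label, scores = predict((2, "M"), prior, cond)
print(label, scores)
```

On this toy data the score for class 1 is $\frac{1}{2}\cdot\frac{2}{3}\cdot\frac{2}{3}=\frac{2}{9}$ and the score for class -1 is $\frac{1}{2}\cdot\frac{1}{3}\cdot\frac{1}{3}=\frac{1}{18}$, so the instance is assigned to class 1.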