看了宗成慶博士的《統計天然語言處理(中文信息處理)》的第六章,對維特比算法有着很是精闢的講解。把其中的講解上傳上來,我的感受比較正統。java
今天用Java實現了這個算法,也能夠轉換爲C代碼:python
package com.nlp.hmm.algorithm.viterbi; /** * 維特比算法 * * @author XueQiang Tong * @date 2017/04/25 */ public class Viterbi { /** * @param obs * 觀察輸出序列 * @param state * 狀態序列(隱式) * @param start_p * 初始機率 * @param trans_p * 狀態轉換機率 * @param emit_p * 發射機率 * @return 返回最優狀態路徑(這個路徑具備最好的解釋能力) */ public static int[] compute(int obs[], int state[], double[] start_p, double[][] trans_p, double[][] emit_p) { double v[][] = new double[obs.length][state.length];// viterbi值矩陣 int path[][] = new int[state.length][obs.length];// 存儲路徑 // 1.初始化時間序列爲0的節點( t=0) for (int s : state) { v[obs[0]][s] = start_p[s] * emit_p[s][obs[0]]; path[s][obs[0]] = s; } // 2.開始遞歸計算viterbi矩陣(從t=1開始) for (int t = 1; t < obs.length; ++t) { int newpath[][] = new int[state.length][obs.length];// 每次迭代時,定義一個新的存儲path,由於每一個時間序列的path都依賴於 // 上個序列的path // 對該時間序列上的每一個state節點進行迭代,每一個節點都與上個序列的全部state交互 for (int s : state) { double maxValue = -1.0; int label = -1; for (int s0 : state) {// 尋找出備選路徑 double viterbiValue = v[t - 1][s0] * trans_p[s0][s] * emit_p[s][obs[t]]; if (viterbiValue > maxValue) { maxValue = viterbiValue; label = s0; // 記錄最大機率 v[t][s] = maxValue; // 記錄路徑 System.arraycopy(path[label], 0, newpath[s], 0, t); newpath[s][t] = s; } } } path = newpath; } // 找出最後一個時間序列上的viterbi值最大的state,從而找出最優路徑 double maxValue = -1.0; int label = -1; for (int s : state) { if (v[obs.length - 1][s] > maxValue) { maxValue = v[obs.length - 1][s]; label = s; } } return path[label]; } }
測試文件:算法
1 package com.nlp.hmm.algorithm.viterbi; 2 3 4 public class ViterbiTest { 5 6 public static void main(String[] args) { 7 int[] obs = {0,1,2};//0--walk,1--shop,2--clean 8 int[] states = {0,1};//0--rainy,1--sunny 9 double[] start_p = {0.6,0.4};//0.6--rainy,0.4--sunny 10 double[][] trans_p = new double[states.length][states.length];//0--rainy,1--sunny 11 trans_p[0][0] = 0.7; 12 trans_p[0][1] = 0.3; 13 trans_p[1][0] = 0.4; 14 trans_p[1][1] = 0.6; 15 double[][] emit_p = new double[states.length][obs.length];//dimension1:rainy 0,sunny 1;dimension2:walk 0 shop 1 clean 2 16 emit_p[0][0] = 0.1; 17 emit_p[0][1] = 0.4; 18 emit_p[0][2] = 0.5; 19 emit_p[1][0] = 0.6; 20 emit_p[1][1] = 0.3; 21 emit_p[1][2] = 0.1; 22 int []path = Viterbi.compute(obs, states, start_p, trans_p, emit_p); 23 for (int pa:path){ 24 System.out.print(pa+" "); 25 } 26 } 27 28 }
輸出結果爲1,0,0,就是sunny,rainy,rainy。對比了一下,結果沒問題,邏輯也沒問題。維特比算法和前向搜索算法有着共同點,二者解決的hmm問題不同。測試
下面是本人的python實現代碼:spa
import numpy as np def viterbi(obs,state,start_prob,transition_prob,emit_prob): vit = np.array(np.zeros(shape = [len(obs),len(state)],dtype = np.float64)) path = np.array(np.zeros(shape = [len(state),len(obs)],dtype = np.int32)) #初始化 for i in range(len(state)): vit[0,i] = start_prob[i] * emit_prob[i][0] path[i][0] = state[i] for t in range(1,len(obs)): newPath = np.zeros_like(path) v = np.dot(np.reshape(vit[t-1],[len(state),1]),np.reshape(emit_prob[:,t],[1,len(state)])) * transition_prob vi = np.max(v,axis = 0) vit[t,:] = vi[:] pa = np.argmax(v,axis = 0) for i in range(len(pa)): label = pa[i] newPath[i,:] = path[label,:] newPath[i][t] = state[i] path = newPath return path[np.argmax(vit[len(obs)-1,:]),:]