System identification with the EM algorithm. Problem statement:
A batch of input-output data has been collected, but it is not known which of the two sub-models each sample comes from:
Model 1: y = k1*x + b1 + v
Model 2: y = k2*x + b2 + w
where v and w are white-noise disturbance terms, each following a zero-mean normal distribution. The task is to use the sample data and the EM algorithm to identify the parameters of Model 1 and Model 2.
For background on the EM algorithm, see the coin-flipping example in this article: https://blog.csdn.net/v_JULY_v/article/details/81708386
The MATLAB source code is available in my other post, or it can be copied directly from the code below.
https://download.csdn.net/download/weixin_42496224/13077074
1. Data generation
Generate 40% of the data from Model 1 and 60% from Model 2, and add white noise to each.
% Data generation process
% White noise
x1 = randn(400,1);
x2 = randn(600,1);
% Generate the data
N = 1000;
x = zeros(N,1);
num_x1 = 1;
num_x2 = 1;
for i = 1 : N*0.4
    x(i) = i;
    y(i) = x(i) + 1 + x1(i);            % Model 1: y = x + 1 + noise
end
for i = 1 : 0.6*N
    x(i+400) = i + 400;
    y(i+400) = 2*x(i+400) + 3 + x2(i);  % Model 2: y = 2x + 3 + noise
end
2. EM algorithm initialization
In the initialization, arbitrary starting values are chosen for k1, b1, k2, b2:
x1_para holds k1 and b1;
x2_para holds k2 and b2.
% Initialize the parameters
x1_para = [1 2]';
x2_para = [3 3]';
x1_M_calulate = [];
x2_M_calulate = [];
y1_M_calulate = [];
y2_M_calulate = [];
M1_num = 1;
M2_num = 1;
% z stores the class label of x(i)
z = [];
3. Iteration
Each iteration consists of an E-step and an M-step.
In the E-step, the current estimates of k1, b1, k2, b2 are used to predict y1 and y2 for every point, and |y - y1| is compared with |y - y2|; whichever is smaller indicates the model that point is more likely to belong to (with zero-mean Gaussian noise of equal variance, the smaller residual corresponds to the higher likelihood).
In the M-step, since the E-step has assigned every point to its more likely model, the points of each model are fitted by nonlinear least squares to obtain updated k1, b1, k2, b2.
The two steps are repeated until the iteration limit is reached.
for o = 1:100
    % E-step
    M1_num = 1;
    M2_num = 1;
    clear x1_M_calulate; clear x2_M_calulate;
    clear y1_M_calulate; clear y2_M_calulate;
    x1_M_calulate = []; x2_M_calulate = [];
    y1_M_calulate = []; y2_M_calulate = [];
    for t = 1:1000
        compare1 = abs(y(t) - x1_para(1)*x(t) - x1_para(2));
        compare2 = abs(y(t) - x2_para(1)*x(t) - x2_para(2));
        if compare1 < compare2
            z(t) = 1;
            x1_M_calulate(M1_num) = x(t);
            y1_M_calulate(M1_num) = y(t);
            M1_num = M1_num + 1;
        else
            z(t) = 2;
            x2_M_calulate(M2_num) = x(t);
            y2_M_calulate(M2_num) = y(t);
            M2_num = M2_num + 1;
        end
    end
    % M-step
    a0 = [1 1];
    options = optimset('lsqnonlin');
    p1 = lsqnonlin(@fun,a0,[],[],options,x1_M_calulate',y1_M_calulate');
    p2 = lsqnonlin(@fun,a0,[],[],options,x2_M_calulate',y2_M_calulate');
    x1_para(1) = p1(1); x1_para(2) = p1(2);
    x2_para(1) = p2(1); x2_para(2) = p2(2);
end
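Note that passing extra data arguments to lsqnonlin after the options structure relies on an older calling convention; in recent MATLAB releases the same fit is usually written with an anonymous function. Also, because each sub-model y = k*x + b is linear in its parameters, the M-step can equivalently be solved in closed form with ordinary linear least squares. A minimal sketch of both alternatives, assuming the workspace variables (x1_M_calulate, y1_M_calulate, M1_num, a0, fun) defined above:

% Alternative M-step (sketch): anonymous-function call to lsqnonlin
p1 = lsqnonlin(@(a) fun(a, x1_M_calulate', y1_M_calulate'), a0);
p2 = lsqnonlin(@(a) fun(a, x2_M_calulate', y2_M_calulate'), a0);

% Equivalent closed-form fit: y = k*x + b is linear in [k; b],
% so the backslash operator solves the same least-squares problem.
A1 = [x1_M_calulate' ones(M1_num-1,1)];
p1_ls = A1 \ y1_M_calulate';   % p1_ls(1) -> k1, p1_ls(2) -> b1
A2 = [x2_M_calulate' ones(M2_num-1,1)];
p2_ls = A2 \ y2_M_calulate';   % p2_ls(1) -> k2, p2_ls(2) -> b2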
The complete code is as follows:
clear; clc;
% 40% of the data: y = x + 1
% 60% of the data: y = 2x + 3
% 1000 points in total
%%
% Data generation process
% White noise
x1 = randn(400,1);
x2 = randn(600,1);
% Generate the data
N = 1000;
x = zeros(N,1);
num_x1 = 1;
num_x2 = 1;
for i = 1 : N*0.4
    x(i) = i;
    y(i) = x(i) + 1 + x1(i);
end
for i = 1 : 0.6*N
    x(i+400) = i + 400;
    y(i+400) = 2*x(i+400) + 3 + x2(i);
end
%%
% EM algorithm
% Initialize the parameters
x1_para = [1 2]';
x2_para = [3 3]';
x1_M_calulate = [];
x2_M_calulate = [];
y1_M_calulate = [];
y2_M_calulate = [];
M1_num = 1;
M2_num = 1;
% z stores the class label of x(i)
z = [];
for o = 1:100
    % E-step
    M1_num = 1;
    M2_num = 1;
    clear x1_M_calulate; clear x2_M_calulate;
    clear y1_M_calulate; clear y2_M_calulate;
    x1_M_calulate = []; x2_M_calulate = [];
    y1_M_calulate = []; y2_M_calulate = [];
    for t = 1:1000
        compare1 = abs(y(t) - x1_para(1)*x(t) - x1_para(2));
        compare2 = abs(y(t) - x2_para(1)*x(t) - x2_para(2));
        if compare1 < compare2
            z(t) = 1;
            x1_M_calulate(M1_num) = x(t);
            y1_M_calulate(M1_num) = y(t);
            M1_num = M1_num + 1;
        else
            z(t) = 2;
            x2_M_calulate(M2_num) = x(t);
            y2_M_calulate(M2_num) = y(t);
            M2_num = M2_num + 1;
        end
    end
    % M-step
    a0 = [1 1];
    options = optimset('lsqnonlin');
    p1 = lsqnonlin(@fun,a0,[],[],options,x1_M_calulate',y1_M_calulate');
    p2 = lsqnonlin(@fun,a0,[],[],options,x2_M_calulate',y2_M_calulate');
    x1_para(1) = p1(1); x1_para(2) = p1(2);
    x2_para(1) = p2(1); x2_para(2) = p2(2);
end
x1_para
x2_para
In addition, create a separate file named fun.m:
function E = fun(a,x,y)
x = x(:);
y = y(:);
Y = a(1)*x + a(2);   % predicted output of the linear model
E = y - Y;           % residual vector returned to lsqnonlin
% End of M-file
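With the data generated above (40% from y = x + 1 and 60% from y = 2x + 3), the displayed x1_para and x2_para should end up close to [1; 1] and [2; 3] respectively (possibly with the two labels swapped). As a quick visual check, one can plot the two classified point sets together with the fitted lines after the script finishes; a minimal sketch, assuming the workspace variables left behind by the script above:

% Visual check (sketch): classified points and the two fitted lines
figure; hold on;
plot(x1_M_calulate, y1_M_calulate, 'b.');   % points assigned to Model 1
plot(x2_M_calulate, y2_M_calulate, 'r.');   % points assigned to Model 2
xx = linspace(min(x), max(x), 100);
plot(xx, x1_para(1)*xx + x1_para(2), 'b-', 'LineWidth', 1.5);
plot(xx, x2_para(1)*xx + x2_para(2), 'r-', 'LineWidth', 1.5);
legend('Model 1 points', 'Model 2 points', 'Model 1 fit', 'Model 2 fit');
hold off;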