As shown in the figure below, let the input to the neural network be \(x^n\), the label corresponding to that input be \(\hat y^n\), the parameters of the network be \(\theta\), and the output of the network be \(y^n\).
The loss of the whole network is \(L(\theta)=\sum_{n=1}^{N}C^n(\theta)\), where \(C^n(\theta)\) is the cost on the \(n\)-th example. If \(w\) is one of the parameters in \(\theta\), then \(\frac{\partial L(\theta)}{\partial w}=\sum^N_{n=1}\frac{\partial C^n(\theta)}{\partial w}\).
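In other words, the gradient of the total loss with respect to a single weight can be accumulated one training example at a time. Here is a minimal Python sketch of that idea, where `grad_cost_wrt_w` is a hypothetical helper that returns \(\frac{\partial C^n}{\partial w}\) for one example:

```python
# A minimal sketch: dL/dw is the sum of the per-example gradients dC^n/dw.
# `grad_cost_wrt_w` is a hypothetical helper that returns dC^n/dw for one example.
def grad_L_wrt_w(examples, grad_cost_wrt_w):
    total = 0.0
    for x_n, y_hat_n in examples:              # examples: list of (x^n, \hat{y}^n) pairs
        total += grad_cost_wrt_w(x_n, y_hat_n) # accumulate dC^n/dw
    return total                               # dL/dw = sum_n dC^n/dw
```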
As shown in the figure below, \(z=x_1w_1+x_2w_2+b\). By the chain rule, \(\frac{\partial C}{\partial w}=\frac{\partial z}{\partial w}\frac{\partial C}{\partial z}\). Computing \(\frac{\partial z}{\partial w}\) for every parameter \(w\) is the Forward Pass; computing \(\frac{\partial C}{\partial z}\) for every activation-function input \(z\) is the Backward Pass.
The Forward Pass computes \(\frac{\partial z}{\partial w}\) for every parameter \(w\). It proceeds from the input towards the output, which is why it is called the Forward Pass.
Taking a single neuron as an example: since \(z=x_1w_1+x_2w_2+b\), we have \(\frac{\partial z}{\partial w_1}=x_1\) and \(\frac{\partial z}{\partial w_2}=x_2\), as shown in the figure below.
The rule is: the derivative is simply the value of the input that the weight multiplies. The same rule holds when there are multiple neurons, as shown in the figure below.
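As a quick check of this rule, here is a tiny numeric sketch; the input, weight, and bias values are made up for illustration:

```python
# For z = x1*w1 + x2*w2 + b, the Forward Pass directly gives dz/dw1 and dz/dw2.
x1, x2 = 2.0, -1.0           # made-up inputs
w1, w2, b = 0.5, 0.3, 0.1    # made-up parameters

z = x1 * w1 + x2 * w2 + b    # forward value of z
dz_dw1 = x1                  # dz/dw1 is just the input that w1 multiplies
dz_dw2 = x2                  # dz/dw2 is just the input that w2 multiplies
```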
The Backward Pass computes \(\frac{\partial C}{\partial z}\) for every activation-function input \(z\). It proceeds from the output back towards the input: the output layer's \(\frac{\partial C}{\partial z}\) is computed first, and then the \(\frac{\partial C}{\partial z}\) of the neurons before it, which is why it is called the Backward Pass.
As shown in the figure above, let \(a=\sigma(z)\). By the chain rule, \(\frac{\partial C}{\partial z}=\frac{\partial a}{\partial z}\frac{\partial C}{\partial a}\), where \(\frac{\partial a}{\partial z}=\sigma'(z)\) is a constant, because the value of \(z\) has already been determined during the Forward Pass. Meanwhile, \(\frac{\partial C}{\partial a}=\frac{\partial z'}{\partial a}\frac{\partial C}{\partial z'}+\frac{\partial z''}{\partial a}\frac{\partial C}{\partial z''}=w_3\frac{\partial C}{\partial z'}+w_4\frac{\partial C}{\partial z''}\), so \(\frac{\partial C}{\partial z}=\sigma'(z)\left[w_3\frac{\partial C}{\partial z'}+w_4\frac{\partial C}{\partial z''}\right]\).
From the expression \(\frac{\partial C}{\partial z}=\sigma'(z)\left[w_3\frac{\partial C}{\partial z'}+w_4\frac{\partial C}{\partial z''}\right]\) we can observe two things:
The computation of \(\frac{\partial C}{\partial z}\) is recursive, because computing \(\frac{\partial C}{\partial z}\) requires \(\frac{\partial C}{\partial z'}\) and \(\frac{\partial C}{\partial z''}\) from the next layer.
As shown in the figure below, \(\frac{\partial C}{\partial z'}\) and \(\frac{\partial C}{\partial z''}\) at the output layer are easy to compute: for an output neuron, \(\frac{\partial C}{\partial z'}=\frac{\partial y_1}{\partial z'}\frac{\partial C}{\partial y_1}\), and both factors depend only on the network output and the cost function, so the recursion terminates there.
The expression \(\frac{\partial C}{\partial z}=\sigma'(z)\left[w_3\frac{\partial C}{\partial z'}+w_4\frac{\partial C}{\partial z''}\right]\) has the form of a neuron.
As shown in the figure below, the only difference is that instead of being passed through a sigmoid, the weighted sum is multiplied by the constant \(\sigma'(z)\). Since every \(\frac{\partial C}{\partial z}\) has this neuron-like form, all of them can be computed by running a neural network in the reverse direction.
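To make the recursion concrete, below is a small Python sketch of both passes on a toy two-layer sigmoid network. The layer sizes, weight values, and the squared-error cost are assumptions made for illustration; only the recursion \(\frac{\partial C}{\partial z}=\sigma'(z)\left[w_3\frac{\partial C}{\partial z'}+w_4\frac{\partial C}{\partial z''}\right]\) and the final combination \(\frac{\partial C}{\partial w}=\frac{\partial z}{\partial w}\frac{\partial C}{\partial z}\) come from the text above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

# Toy network: 2 inputs -> 2 hidden neurons -> 2 output neurons (made-up sizes and values).
x     = np.array([1.0, -2.0])                                          # input x^n
y_hat = np.array([1.0,  0.0])                                          # label \hat{y}^n
W1, b1 = np.array([[0.5, -0.3], [0.8, 0.2]]), np.array([0.1, -0.1])    # hidden layer
W2, b2 = np.array([[0.4,  0.7], [-0.6, 0.9]]), np.array([0.0,  0.2])   # output layer

# Forward Pass: compute and store every z; the inputs feeding each layer are the dz/dw values.
z1 = W1 @ x + b1          # inputs to the hidden activations
a1 = sigmoid(z1)
z2 = W2 @ a1 + b2         # inputs to the output activations (the z', z'' of the post)
y  = sigmoid(z2)

# Backward Pass, starting from the output layer:
# dC/dz' = sigma'(z') * dC/dy, with the assumed cost C = 1/2 * ||y - y_hat||^2.
dC_dz2 = sigmoid_prime(z2) * (y - y_hat)

# Recursive step from the post: dC/dz = sigma'(z) * sum_k w_k * dC/dz'_k.
dC_dz1 = sigmoid_prime(z1) * (W2.T @ dC_dz2)

# Combining both passes gives the gradients: dC/dw = dz/dw * dC/dz.
dC_dW2 = np.outer(dC_dz2, a1)   # dz2/dW2 is the hidden activation a1
dC_dW1 = np.outer(dC_dz1, x)    # dz1/dW1 is the input x
```

Note how the Forward Pass stores every \(z\) (and the activations), so each \(\sigma'(z)\) used in the Backward Pass is just a known constant, exactly as argued above.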