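The vision layers cover operations on visual data, such as convolution, deconvolution, and pooling. As with the data layers, the classes here form a fairly deep inheritance hierarchy. The CuDNN-prefixed classes are the corresponding CUDA (cuDNN) versions and are not discussed here; this post starts with ConvolutionLayer.
BaseConvolutionLayer inherits from Layer and is the base class for both the convolution and deconvolution operations. Let's first look at BaseConvolutionLayer's LayerSetUp function.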
template <typename Dtype>
void BaseConvolutionLayer<Dtype>::LayerSetUp(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  // Configure the convolution kernel size, padding, stride, and inputs.
  ConvolutionParameter conv_param = this->layer_param_.convolution_param();
  force_nd_im2col_ = conv_param.force_nd_im2col();
  channel_axis_ = bottom[0]->CanonicalAxisIndex(conv_param.axis());
  const int first_spatial_axis = channel_axis_ + 1;
  const int num_axes = bottom[0]->num_axes();
  num_spatial_axes_ = num_axes - first_spatial_axis;
  CHECK_GE(num_spatial_axes_, 0);
  vector<int> bottom_dim_blob_shape(1, num_spatial_axes_ + 1);
  vector<int> spatial_dim_blob_shape(1, std::max(num_spatial_axes_, 1));
  // Set up the kernel dimensions.
  kernel_shape_.Reshape(spatial_dim_blob_shape);
  int* kernel_shape_data = kernel_shape_.mutable_cpu_data();
  // ... (excerpt; the function continues)
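To make the axis arithmetic above concrete, here is a minimal sketch of how the number of spatial axes falls out of the channel axis. The helper names `canonical_axis_index` and `num_spatial_axes` are hypothetical stand-ins written for this post, not Caffe functions; the first one only imitates the wrap-around of negative indices that `Blob::CanonicalAxisIndex` performs.

```cpp
#include <cassert>

// Hypothetical stand-in for Blob::CanonicalAxisIndex: maps an axis index
// that may be negative (counted from the end) into the range [0, num_axes).
int canonical_axis_index(int axis, int num_axes) {
  assert(axis >= -num_axes && axis < num_axes);
  return axis < 0 ? axis + num_axes : axis;
}

// Every axis after the channel axis is treated as a spatial axis.
int num_spatial_axes(int axis, int num_axes) {
  const int channel_axis = canonical_axis_index(axis, num_axes);
  const int first_spatial_axis = channel_axis + 1;
  return num_axes - first_spatial_axis;
}
```

For an N x C x H x W blob with the channel axis at 1, `num_spatial_axes(1, 4)` gives 2 (H and W); a 5-axis blob with the same channel axis gives 3.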
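Next, the stride dimensions are set. For the 2D case this means the stride along the h and w directions. The code is long, so only an abbreviated version is listed here.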
stride_.Reshape(spatial_dim_blob_shape);
int* stride_data = stride_.mutable_cpu_data();
stride_data[0] = conv_param.stride_h();
stride_data[1] = conv_param.stride_w();
// ... followed by a series of if/else checks
The padding for the kernel is set up in the same way.
pad_.Reshape(spatial_dim_blob_shape);
int* pad_data = pad_.mutable_cpu_data();
pad_data[0] = conv_param.pad_h();
pad_data[1] = conv_param.pad_w();
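Kernel size, stride, and padding together determine the spatial size of the output. The sketch below shows the usual relation for one spatial dimension; it assumes no dilation, and `conv_output_size` is a hypothetical helper name, not something taken from the Caffe source.

```cpp
#include <cassert>

// Output extent along one spatial dimension, assuming no dilation:
//   out = (in + 2 * pad - kernel) / stride + 1   (integer division)
int conv_output_size(int input, int kernel, int pad, int stride) {
  assert(stride > 0);
  assert(input + 2 * pad >= kernel);
  return (input + 2 * pad - kernel) / stride + 1;
}
```

For example, a 227-wide input with an 11-wide kernel, no padding, and stride 4 gives (227 - 11) / 4 + 1 = 55 output positions, while a 3-wide kernel with pad 1 and stride 1 keeps the input size unchanged.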
Next, the weights and biases are set up and filled. blobs_[0] stores the filter weights and blobs_[1] stores the biases. The biases are optional and may be absent.
// Set up the corresponding shapes and check them.
vector<int> weight_shape(2);
weight_shape[0] = conv_out_channels_;
weight_shape[1] = conv_in_channels_ / group_;
bias_term_ = this->layer_param_.convolution_param().bias_term();
vector<int> bias_shape(bias_term_, num_output_);
// Fill the weights.
this->blobs_[0].reset(new Blob<Dtype>(weight_shape));
shared_ptr<Filler<Dtype> > weight_filler(GetFiller<Dtype>(
    this->layer_param_.convolution_param().weight_filler()));
weight_filler->Fill(this->blobs_[0].get());
// Fill the bias term, if there is one.
if (bias_term_) {
  this->blobs_[1].reset(new Blob<Dtype>(bias_shape));
  shared_ptr<Filler<Dtype> > bias_filler(GetFiller<Dtype>(
      this->layer_param_.convolution_param().bias_filler()));
  bias_filler->Fill(this->blobs_[1].get());
}
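The shapes being built here can be sketched as plain functions. This is an illustration under assumptions, not the Caffe code: the names `weight_shape_for` and `bias_shape_for` are hypothetical, and appending the kernel dimensions after the first two entries is my reading of what the full function does beyond the excerpt shown above.

```cpp
#include <cassert>
#include <vector>

// Weight shape: {out_channels, in_channels / group, kernel dims...}.
// Assumption: the kernel dimensions are appended after the first two entries.
std::vector<int> weight_shape_for(int out_channels, int in_channels, int group,
                                  const std::vector<int>& kernel_shape) {
  assert(group > 0);
  assert(in_channels % group == 0 && out_channels % group == 0);
  std::vector<int> shape = {out_channels, in_channels / group};
  shape.insert(shape.end(), kernel_shape.begin(), kernel_shape.end());
  return shape;
}

// Mirrors `vector<int> bias_shape(bias_term_, num_output_)`: the bool is
// used as an element count, so the result is {num_output} or an empty vector.
std::vector<int> bias_shape_for(bool bias_term, int num_output) {
  return std::vector<int>(bias_term ? 1 : 0, num_output);
}
```

With 8 output channels, 4 input channels, 2 groups, and a 3 x 3 kernel, the weight shape comes out as {8, 2, 3, 3}, and the bias shape is {8} when the bias term is enabled.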
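ConvolutionLayer inherits from BaseConvolutionLayer. Its main job is to convolve an image using the learned filter parameters and biases. Caffe optimizes the convolution by turning it into a matrix multiplication; the two key functions for this are im2col and col2im.
In the figure, the upper half shows a traditional convolution and the lower half shows the matrix-multiplication version.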
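The figure below shows how the convolution in a convolutional layer is unrolled. Data is taken from the input in patches the size of the kernel and flattened. Within one image, the patches taken at different window positions are placed on successive rows. Patches from different N (for example, multiple channels) are appended to the end of the same row. Note that every segment within a row must correspond to the convolution window at the same position in the original image.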
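The unrolling described above can be sketched in a few lines. This is a simplified illustration, not Caffe's `im2col_cpu`: it assumes stride 1 and no padding, the function names are hypothetical, and it uses the row-per-window layout from the description (as far as I recall, Caffe's own column buffer stores the transpose of this layout). A direct sliding-window version is included so the two results can be compared.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One row per window position; each row holds C segments of kh*kw values,
// one segment per channel. img is laid out as C x H x W.
std::vector<float> im2col_rows(const std::vector<float>& img, int C, int H,
                               int W, int kh, int kw) {
  assert(H >= kh && W >= kw);
  const int out_h = H - kh + 1, out_w = W - kw + 1;
  std::vector<float> col;
  col.reserve(static_cast<std::size_t>(out_h) * out_w * C * kh * kw);
  for (int y = 0; y < out_h; ++y)
    for (int x = 0; x < out_w; ++x)
      for (int c = 0; c < C; ++c)
        for (int i = 0; i < kh; ++i)
          for (int j = 0; j < kw; ++j)
            col.push_back(img[(c * H + y + i) * W + x + j]);
  return col;
}

// Convolution as a matrix product. filters is M x (C*kh*kw); the result is
// M x P, where P is the number of window positions.
std::vector<float> conv_via_im2col(const std::vector<float>& img, int C, int H,
                                   int W, const std::vector<float>& filters,
                                   int M, int kh, int kw) {
  const int P = (H - kh + 1) * (W - kw + 1), K = C * kh * kw;
  const std::vector<float> col = im2col_rows(img, C, H, W, kh, kw);
  std::vector<float> out(static_cast<std::size_t>(M) * P, 0.0f);
  for (int m = 0; m < M; ++m)
    for (int p = 0; p < P; ++p)
      for (int k = 0; k < K; ++k)
        out[m * P + p] += filters[m * K + k] * col[p * K + k];
  return out;
}

// Direct sliding-window convolution, used only as a reference.
std::vector<float> conv_direct(const std::vector<float>& img, int C, int H,
                               int W, const std::vector<float>& filters, int M,
                               int kh, int kw) {
  const int out_h = H - kh + 1, out_w = W - kw + 1;
  std::vector<float> out(static_cast<std::size_t>(M) * out_h * out_w, 0.0f);
  for (int m = 0; m < M; ++m)
    for (int y = 0; y < out_h; ++y)
      for (int x = 0; x < out_w; ++x)
        for (int c = 0; c < C; ++c)
          for (int i = 0; i < kh; ++i)
            for (int j = 0; j < kw; ++j)
              out[(m * out_h + y) * out_w + x] +=
                  filters[((m * C + c) * kh + i) * kw + j] *
                  img[(c * H + y + i) * W + x + j];
  return out;
}
```

The point of the layout is that after unrolling, the whole layer reduces to one dense matrix product, which a tuned BLAS routine can handle far better than nested window loops.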
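One more concept in the convolution layer needs explaining: group, the number of filter groups. Groups were introduced to connect the input and output channels of the convolution layer selectively; otherwise there would be too many parameters. Each group convolves 1/group of the input channels with 1/group of the output channels. For example, with 4 input channels, 8 output channels, and group = 2, output channels 1-4 belong to the first group (connected only to input channels 1-2) and output channels 5-8 belong to the second group (connected only to input channels 3-4).
ConvolutionLayer mainly overrides the Forward_cpu and Backward_cpu interfaces.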
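The group mapping can be written down as a small sketch. The function names below are hypothetical and channel indices are zero-based, so the "output channels 1-4" of the example correspond to indices 0-3 here.

```cpp
#include <cassert>
#include <utility>

// Which group an output channel belongs to (zero-based indices).
int group_of_output(int out_channel, int num_output, int group) {
  assert(group > 0 && num_output % group == 0);
  assert(out_channel >= 0 && out_channel < num_output);
  return out_channel / (num_output / group);
}

// Half-open range [first, last) of input channels that an output channel
// is connected to.
std::pair<int, int> input_range(int out_channel, int num_output, int num_input,
                                int group) {
  assert(num_input % group == 0);
  const int per_group = num_input / group;
  const int g = group_of_output(out_channel, num_output, group);
  return std::make_pair(g * per_group, (g + 1) * per_group);
}

// Number of weights (biases excluded) for a kh x kw kernel.
long weight_count(int num_output, int num_input, int group, int kh, int kw) {
  return static_cast<long>(num_output) * (num_input / group) * kh * kw;
}
```

With 4 inputs, 8 outputs, and a 3 x 3 kernel, going from group = 1 to group = 2 halves the weight count from 288 to 144, which is the parameter saving the paragraph refers to.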
template <typename Dtype>
void ConvolutionLayer<Dtype>::Forward_cpu(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  const Dtype* weight = this->blobs_[0]->cpu_data();
  for (int i = 0; i < bottom.size(); ++i) {
    const Dtype* bottom_data = bottom[i]->cpu_data();
    Dtype* top_data = top[i]->mutable_cpu_data();
    for (int n = 0; n < this->num_; ++n) {
      this->forward_cpu_gemm(bottom_data + n * this->bottom_dim_, weight,
          top_data + n * this->top_dim_);
      if (this->bias_term_) {
        const Dtype* bias = this->blobs_[1]->cpu_data();
        this->forward_cpu_bias(top_data + n * this->top_dim_, bias);
      }
    }
  }
}
As you can see, this calls forward_cpu_gemm, which in turn calls caffe_cpu_gemm, the general matrix multiplication interface in math_functions. GEMM stands for General Matrix-Matrix Multiply. Its basic form is:
\[C = \alpha \cdot \mathrm{op}(A) \cdot \mathrm{op}(B) + \beta \cdot C\]
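A naive reference version makes the formula easier to read. The sketch below is not `caffe_cpu_gemm` and not a BLAS call; `naive_gemm` is a hypothetical name, the matrices are assumed to be row-major, and op(X) is either X or its transpose depending on a flag.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// C (M x N) = alpha * op(A) * op(B) + beta * C, all matrices row-major.
// op(A) is M x K and op(B) is K x N. With trans_a set, A is stored as
// K x M; with trans_b set, B is stored as N x K.
void naive_gemm(bool trans_a, bool trans_b, int M, int N, int K, float alpha,
                const std::vector<float>& A, const std::vector<float>& B,
                float beta, std::vector<float>& C) {
  assert(A.size() == static_cast<std::size_t>(M) * K);
  assert(B.size() == static_cast<std::size_t>(K) * N);
  assert(C.size() == static_cast<std::size_t>(M) * N);
  for (int i = 0; i < M; ++i) {
    for (int j = 0; j < N; ++j) {
      float acc = 0.0f;
      for (int k = 0; k < K; ++k) {
        const float a = trans_a ? A[k * M + i] : A[i * K + k];
        const float b = trans_b ? B[j * K + k] : B[k * N + j];
        acc += a * b;
      }
      C[i * N + j] = alpha * acc + beta * C[i * N + j];
    }
  }
}
```

Setting beta to 0 overwrites C with the product, while beta = 1 accumulates into it. The accumulating form is what makes it possible to sum a gradient over all samples in a batch with repeated calls.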
template <typename Dtype>
void ConvolutionLayer<Dtype>::Backward_cpu(const vector<Blob<Dtype>*>& top,
    const vector<bool>& propagate_down, const vector<Blob<Dtype>*>& bottom) {
  // Back-propagate the gradient (error).
  const Dtype* weight = this->blobs_[0]->cpu_data();
  Dtype* weight_diff = this->blobs_[0]->mutable_cpu_diff();
  for (int i = 0; i < top.size(); ++i) {
    const Dtype* top_diff = top[i]->cpu_diff();
    const Dtype* bottom_data = bottom[i]->cpu_data();
    Dtype* bottom_diff = bottom[i]->mutable_cpu_diff();
    // If there is a bias term, compute the gradient with respect to the bias.
    if (this->bias_term_ && this->param_propagate_down_[1]) {
      Dtype* bias_diff = this->blobs_[1]->mutable_cpu_diff();
      for (int n = 0; n < this->num_; ++n) {
        this->backward_cpu_bias(bias_diff, top_diff + n * this->top_dim_);
      }
    }
    // Compute the weight and bottom gradients.
    if (this->param_propagate_down_[0] || propagate_down[i]) {
      for (int n = 0; n < this->num_; ++n) {
        // Gradient with respect to the weights.
        if (this->param_propagate_down_[0]) {
          this->weight_cpu_gemm(bottom_data + n * this->bottom_dim_,
              top_diff + n * this->top_dim_, weight_diff);
        }
        // Gradient with respect to the bottom data, passed on to the layer below.
        if (propagate_down[i]) {
          this->backward_cpu_gemm(top_diff + n * this->top_dim_, weight,
              bottom_diff + n * this->bottom_dim_);
        }
      }
    }
  }
}
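The three gradient computations in Backward_cpu can be summarized as matrix equations. This is a sketch under assumptions: the symbols are introduced here for illustration and do not appear in the Caffe source. For one sample $n$, write $X_n$ for the unrolled (im2col) input of size $K \times P$, $W$ for the $M \times K$ weight matrix, $b$ for the bias vector, $Y_n$ for the $M \times P$ output, and $L$ for the loss, so that the forward pass is $Y_n = W X_n + b\,\mathbf{1}^\top$.

```latex
\begin{aligned}
\frac{\partial L}{\partial b}   &= \sum_{n} \frac{\partial L}{\partial Y_n}\,\mathbf{1}
  && \text{(accumulated by backward\_cpu\_bias)} \\
\frac{\partial L}{\partial W}   &= \sum_{n} \frac{\partial L}{\partial Y_n}\,X_n^{\top}
  && \text{(accumulated by weight\_cpu\_gemm)} \\
\frac{\partial L}{\partial X_n} &= W^{\top}\,\frac{\partial L}{\partial Y_n}
  && \text{(backward\_cpu\_gemm, followed by col2im)}
\end{aligned}
```

The sums over $n$ correspond to the loops over `this->num_` in the code. Both parameter gradients accumulate across samples, while the bottom gradient is written separately for each sample.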