Paper Notes: Rethinking the Inception Architecture for Computer Vision

1. Main Idea

  • factorized convolutions and aggressive regularization.
  • The paper presents a number of network design techniques.

2. Results

  • The network uses about 5 billion multiply-adds per inference and fewer than 25 million parameters. With an ensemble of 4 models and multi-crop evaluation, we report 3.5% top-5 error and 17.3% top-1 error.

3. Introduction

  • Scaling up convolutional networks in efficient ways.

4. General Design Principles

  1. Avoid representational bottlenecks, especially early in the network. (Put simply, the feature map size should decrease gradually.)

  2. Higher dimensional representations are easier to process locally within a network. Increasing the activations per tile in a convolutional network allows for more disentangled features. The resulting networks will train faster. (Deeper layers should use more feature maps, which helps accommodate more disentangled features and speeds up training.)

  3. Spatial aggregation can be done over lower dimensional embeddings without much or any loss in representational power. (This is the idea behind the bottleneck layer design.)

  4. Balance the width and depth of the network. (Increasing both the width and the depth of the network can contribute to higher quality networks, so the two should be grown together.)
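Principle 3 can be made concrete with a rough cost count. The sketch below compares a direct 3x3 convolution with a 1x1 channel reduction followed by a 3x3 convolution. The helper `conv_mult_adds` and all layer sizes are hypothetical, chosen only for illustration; they are not taken from the paper.

```python
def conv_mult_adds(h, w, c_in, c_out, kh, kw):
    """Multiply-adds of a stride-1, 'same'-padded convolution on an h x w grid."""
    return h * w * c_in * c_out * kh * kw


# Hypothetical sizes for illustration only (not from the paper).
H, W, C_IN, C_MID, C_OUT = 17, 17, 256, 64, 256

# Direct 3x3 convolution on the full-width input.
direct = conv_mult_adds(H, W, C_IN, C_OUT, 3, 3)

# 1x1 reduction to C_MID channels, then the 3x3 spatial aggregation.
reduced = (conv_mult_adds(H, W, C_IN, C_MID, 1, 1)
           + conv_mult_adds(H, W, C_MID, C_OUT, 3, 3))

bottleneck_ratio = reduced / direct
print(f"direct 3x3:   {direct:,} multiply-adds")
print(f"1x1 then 3x3: {reduced:,} multiply-adds")
print(f"ratio:        {bottleneck_ratio:.3f}")
```

With these assumed sizes the reduced variant needs a bit more than a quarter of the multiply-adds. Whether accuracy is preserved depends on the network and is not something this count can show.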

5. Factorizing Convolution With Large Filter Size

  • Factorize convolutions that have a large filter size.

5.1. Factorization into smaller convolutions

  • A 5x5 convolution can be factorized into two stacked 3x3 convolutions.

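  • Experiments show that when one convolution is factorized into two, applying ReLU after the first convolution improves accuracy. In other words, a purely linear factorization performs somewhat worse.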
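A quick check of why the 5x5 factorization is attractive: two stacked 3x3 convolutions cover the same 5x5 receptive field with fewer weights. This is a minimal sketch that assumes stride 1 and the same number of channels in every layer. The helper names are made up for this note, and the count ignores the activation between the two layers.

```python
def receptive_field(kernel_sizes):
    """Receptive field (one side) of a stack of stride-1 convolutions."""
    rf = 1
    for k in kernel_sizes:
        rf += k - 1
    return rf


def weights_per_channel_pair(kernel_sizes):
    """Weights per (input channel, output channel) pair, summed over the
    stack, assuming the channel count stays constant across layers."""
    return sum(k * k for k in kernel_sizes)


single_5x5 = weights_per_channel_pair([5])       # 5 * 5 = 25
stacked_3x3 = weights_per_channel_pair([3, 3])   # 9 + 9 = 18
saving_5x5 = 1 - stacked_3x3 / single_5x5

# Both variants see the same 5x5 input window.
assert receptive_field([5]) == receptive_field([3, 3]) == 5

print(f"5x5: {single_5x5} weights, two 3x3: {stacked_3x3} weights")
print(f"relative saving: {saving_5x5:.0%}")
```

Under these assumptions the stacked version uses 18 weights instead of 25 per channel pair, a saving of 28%.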

5.2. Spatial Factorization into Asymmetric Convolutions

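  • Factorizing a 3x3 convolution into a 3x1 convolution followed by a 1x3 convolution reduces the computation by 33%. Factorizing a 3x3 into two 2x2 convolutions saves only 11%, so the asymmetric factorization is the better choice.
  • In practice, this factorization should not be used too early in the network. It works best on feature maps whose size is between 12 and 20.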
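The 33% and 11% figures follow from counting weights per output position. The sketch below assumes every layer in a factorization has the same number of input and output channels; `factorization_saving` is a hypothetical helper written for this note.

```python
def factorization_saving(original, factors):
    """Relative saving in weights (and in multiply-adds per output position)
    when one (kh, kw) kernel is replaced by a stack of smaller kernels.
    Assumes equal input/output channel counts in every layer."""
    original_cost = original[0] * original[1]
    factored_cost = sum(kh * kw for kh, kw in factors)
    return 1 - factored_cost / original_cost


# 3x3 -> 3x1 followed by 1x3: 6 weights instead of 9.
asymmetric_saving = factorization_saving((3, 3), [(3, 1), (1, 3)])

# 3x3 -> two 2x2: 8 weights instead of 9.
two_2x2_saving = factorization_saving((3, 3), [(2, 2), (2, 2)])

print(f"3x1 + 1x3: {asymmetric_saving:.0%} cheaper")
print(f"2x2 + 2x2: {two_2x2_saving:.0%} cheaper")
```

The count only covers cost. The observation that the factorization works best on mid-sized feature maps is empirical and is not captured here.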

6. Utility of Auxiliary Classifier

7. Efficient Grid Size Reduction

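  • In the paper's comparison figure, the left variant (pooling before the channels are expanded) introduces a representational bottleneck, while the right variant (expanding the channels before pooling) adds a large amount of computation. The best approach is to reduce the feature map size and increase the number of channels at the same time.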

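  • That is the correct approach: in the paper it is implemented as a stride-2 convolution branch and a stride-2 pooling branch running in parallel, with their outputs concatenated.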
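The trade-off between the three grid reduction options can be sketched with a simple cost model. It assumes a d x d grid with k channels is reduced to a (d/2) x (d/2) grid with 2k channels, and it uses a 1x1 kernel in every variant so the comparison is like for like. Real reduction blocks use larger kernels and several branches. The sizes and the helper `conv_mult_adds_grid` are hypothetical.

```python
def conv_mult_adds_grid(out_size, c_in, c_out, kernel=1):
    """Multiply-adds of a convolution that produces an out_size x out_size grid."""
    return out_size * out_size * c_in * c_out * kernel * kernel


# Hypothetical sizes for illustration only.
d, k = 32, 64

# Expand to 2k channels at full resolution, then pool:
# no bottleneck, but the convolution runs on the large grid.
conv_then_pool = conv_mult_adds_grid(d, k, 2 * k)

# Pool first, then expand to 2k channels: cheap, but the representation
# is first squeezed down to (d/2)^2 * k values (the bottleneck).
pool_then_conv = conv_mult_adds_grid(d // 2, k, 2 * k)

# Parallel stride-2 branches: a convolution producing k channels plus a
# pooling branch passing k channels through, concatenated to 2k channels.
# Pooling contributes no multiply-adds in this simplified model.
parallel = conv_mult_adds_grid(d // 2, k, k)
parallel_channels = k + k

for name, cost in [("conv then pool", conv_then_pool),
                   ("pool then conv", pool_then_conv),
                   ("parallel", parallel)]:
    print(f"{name:>15}: {cost:>10,} multiply-adds")
```

In this model, expanding before pooling costs four times as much as pooling first, and the parallel arrangement reaches the same 2k output channels at lower cost still. The exact ratios depend on the kernel sizes actually used.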