MobileNet Series of Mobile Neural Networks: Paper Walkthrough and Simple Code Implementation (MobileNetv1)

Date: 2019-11-13


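MobileNetv1:

MobileNets are a class of efficient models proposed for mobile and embedded vision applications. They are built on a streamlined architecture that uses depthwise separable convolutions to construct lightweight deep neural networks.

The core of MobileNet is the depthwise separable convolution. The sections below cover it first, then describe the MobileNet network structure and the two model-shrinking hyperparameters: the width multiplier and the resolution multiplier.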

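Depthwise separable convolution

MobileNet is a model built on depthwise separable convolutions, which factorize a standard convolution into a depthwise convolution and a 1x1 convolution called a pointwise convolution. In MobileNet, the depthwise convolution applies a single filter to each input channel, and the pointwise convolution then applies a 1x1 convolution to combine the outputs of the depthwise convolution. A standard convolution filters and combines all inputs into a new set of outputs in a single step. The depthwise separable convolution splits this into two steps: one that filters each channel separately, and one that combines the results. This factorization drastically reduces both the computation and the model size.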
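As a minimal sketch of the two steps described above, the following pure-Python code applies a depthwise convolution (one filter per channel) and then a pointwise 1x1 convolution (mixing channels). It assumes stride 1, no padding, and a channels-first nested-list layout; the function names `depthwise_conv` and `pointwise_conv` and the toy data are illustrative and not taken from the original post.

```python
def depthwise_conv(x, kernels):
    """x: [M][H][W] input, kernels: [M][K][K], one filter per input channel.

    Stride 1, no padding, so the output is [M][H-K+1][W-K+1].
    """
    out = []
    for channel, kernel in zip(x, kernels):
        k = len(kernel)
        h_out = len(channel) - k + 1
        w_out = len(channel[0]) - k + 1
        fmap = [[sum(channel[i + di][j + dj] * kernel[di][dj]
                     for di in range(k) for dj in range(k))
                 for j in range(w_out)]
                for i in range(h_out)]
        out.append(fmap)
    return out


def pointwise_conv(x, weights):
    """x: [M][H][W], weights: [N][M]; a 1x1 convolution that mixes channels."""
    h, w = len(x[0]), len(x[0][0])
    return [[[sum(wn[m] * x[m][i][j] for m in range(len(x)))
              for j in range(w)]
             for i in range(h)]
            for wn in weights]


# Two 3x3 input channels, one 2x2 filter per channel, three output channels.
x = [[[1, 2, 3], [4, 5, 6], [7, 8, 9]],
     [[1, 0, 1], [0, 1, 0], [1, 0, 1]]]
dw_kernels = [[[1, 0], [0, 1]],   # channel 0: top-left plus bottom-right of the window
              [[1, 1], [1, 1]]]   # channel 1: sum of the 2x2 window
pw_weights = [[1, 0], [0, 1], [1, 1]]

dw = depthwise_conv(x, dw_kernels)   # filtering step: channels stay separate
y = pointwise_conv(dw, pw_weights)   # combining step: channels are mixed
print(len(dw), "channels after depthwise,", len(y), "channels after pointwise")
```

Note that the depthwise step never changes the number of channels; only the pointwise step decides how many output channels the layer produces.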

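The paper then compares computational cost. Let $D_K$ be the kernel size, $D_F$ the spatial size of the feature map, $M$ the number of input channels, and $N$ the number of output channels.

Standard convolution:

$$D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F$$

Depthwise convolution:

$$D_K \cdot D_K \cdot M \cdot D_F \cdot D_F$$

Depthwise separable convolution (depthwise + pointwise):

$$D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F$$

Compared with the standard convolution:

$$\frac{D_K \cdot D_K \cdot M \cdot D_F \cdot D_F + M \cdot N \cdot D_F \cdot D_F}{D_K \cdot D_K \cdot M \cdot N \cdot D_F \cdot D_F} = \frac{1}{N} + \frac{1}{D_K^2}$$

With 3x3 depthwise separable kernels, MobileNet needs 8 to 9 times less computation than standard convolutions.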
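The comparison can be checked numerically. The sketch below uses the cost expressions from the MobileNet paper: D_K·D_K·M·N·D_F·D_F for a standard convolution, and D_K·D_K·M·D_F·D_F + M·N·D_F·D_F for the depthwise separable one. The function names and the example layer (3x3 kernel, 512 input and output channels, 14x14 feature map) are assumptions chosen for illustration.

```python
def standard_conv_cost(d_k, m, n, d_f):
    """Multiply-adds of a standard convolution: D_K*D_K*M*N*D_F*D_F."""
    return d_k * d_k * m * n * d_f * d_f


def separable_conv_cost(d_k, m, n, d_f):
    """Multiply-adds of a depthwise separable convolution."""
    depthwise = d_k * d_k * m * d_f * d_f   # one D_K x D_K filter per input channel
    pointwise = m * n * d_f * d_f           # 1x1 convolution combining the channels
    return depthwise + pointwise


# Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
std = standard_conv_cost(3, 512, 512, 14)
sep = separable_conv_cost(3, 512, 512, 14)
reduction = std / sep
print(f"standard: {std}, separable: {sep}, reduction: {reduction:.2f}x")
```

For a 3x3 kernel the ratio 1/N + 1/D_K^2 is dominated by the 1/9 term, which is why the saving lands between 8x and 9x once N is large.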

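Network structure

Width multiplier: thinner models

Although the base MobileNet architecture is already small and low-latency, a specific use case or application may require the model to be smaller and faster. To build these smaller and less computationally expensive models, the paper introduces a very simple parameter α called the width multiplier. Its role is to thin the network uniformly at each layer. For a given layer and width multiplier α, the number of input channels M becomes αM and the number of output channels N becomes αN.
With kernel size $D_K$ and feature map size $D_F$, the cost of a depthwise separable layer becomes $$D_K \cdot D_K \cdot \alpha M \cdot D_F \cdot D_F + \alpha M \cdot \alpha N \cdot D_F \cdot D_F$$ which reduces the computation further, roughly by a factor of $\alpha^2$.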
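To make the effect of α concrete, here is a small sketch that thins the channel counts and recomputes the depthwise separable cost D_K·D_K·αM·D_F·D_F + αM·αN·D_F·D_F. It truncates with `int(channels * alpha)`, mirroring `int(pointwise_conv_filters * alpha)` in the implementation later in this post; the function names and the example layer are assumptions for illustration.

```python
def thin_channels(channels, alpha):
    """Scale a channel count by the width multiplier, truncating to an int."""
    return int(channels * alpha)


def separable_cost_with_alpha(d_k, m, n, d_f, alpha):
    """Depthwise separable cost with alpha*M input and alpha*N output channels."""
    m_thin = thin_channels(m, alpha)
    n_thin = thin_channels(n, alpha)
    return d_k * d_k * m_thin * d_f * d_f + m_thin * n_thin * d_f * d_f


# Hypothetical layer: 3x3 kernel, 512 -> 512 channels, 14x14 feature map.
base = separable_cost_with_alpha(3, 512, 512, 14, alpha=1.0)
for alpha in (1.0, 0.75, 0.5, 0.25):
    cost = separable_cost_with_alpha(3, 512, 512, 14, alpha)
    print(f"alpha={alpha}: channels={thin_channels(512, alpha)}, "
          f"cost ratio={cost / base:.3f}")
```

The pointwise term scales with α² and dominates the total, so halving the width cuts the cost to roughly a quarter.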

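Resolution multiplier: reduced representation

The second hyperparameter for reducing the network's computation is the resolution multiplier ρ. In practice, ρ is set implicitly by choosing the input resolution.
Combined with the width multiplier α, the cost of a depthwise separable layer with kernel size $D_K$, feature map size $D_F$, $M$ input channels and $N$ output channels becomes $$D_K \cdot D_K \cdot \alpha M \cdot \rho D_F \cdot \rho D_F + \alpha M \cdot \alpha N \cdot \rho D_F \cdot \rho D_F$$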
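The following sketch evaluates the combined cost D_K·D_K·αM·ρD_F·ρD_F + αM·αN·ρD_F·ρD_F for a few input resolutions. It assumes a reference input of 224 pixels, derives ρ as `resolution / 224`, and rounds ρ·D_F to the nearest integer; these choices and the function name `separable_cost` are assumptions of this example, not part of the original post.

```python
def separable_cost(d_k, m, n, d_f, alpha=1.0, rho=1.0):
    """Depthwise separable cost with width multiplier alpha and resolution multiplier rho."""
    m_thin = int(alpha * m)
    n_thin = int(alpha * n)
    d_f_small = round(rho * d_f)   # feature map size after shrinking the input
    depthwise = d_k * d_k * m_thin * d_f_small * d_f_small
    pointwise = m_thin * n_thin * d_f_small * d_f_small
    return depthwise + pointwise


# Hypothetical layer whose feature map is 14x14 when the input is 224x224.
base = separable_cost(3, 512, 512, 14)
for resolution in (224, 192, 160, 128):
    rho = resolution / 224
    cost = separable_cost(3, 512, 512, 14, rho=rho)
    print(f"input {resolution}: rho={rho:.3f}, cost ratio={cost / base:.3f}")
```

Both terms contain ρD_F twice, so the cost falls with ρ² regardless of the channel counts.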

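That is the core idea of the paper: find ways to reduce the computation. The rest of the paper is experimental validation, showing that the parameter count drops sharply at only a small cost in accuracy.

Network implementation: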

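# mobilenet_v1 network definition (TensorFlow 1.x API: tf.layers, tf.contrib)
import tensorflow as tf

def mobilenet_v1(inputs, alpha, is_training):
    # `const` is not defined in this snippet; presumably the author's project config module
    assert const.use_batch_norm == True
    # assert declares that its expression must be true; an AssertionError means it was false
    # Width multiplier (shrink factor); only 1, 0.75, 0.5 and 0.25 are allowed
    if alpha not in [0.25, 0.50, 0.75, 1.0]:
        raise ValueError('alpha can be one of '
                         '`0.25`, `0.50`, `0.75` or `1.0` only.')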
    filter_initializer = tf.contrib.layers.xavier_initializer()
    # Convolution, BN, ReLU
    def _conv2d(inputs, filters, kernel_size, stride, scope=''):
        with tf.variable_scope(scope):
            outputs = tf.layers.conv2d(inputs, filters, kernel_size,
                                       strides=(stride, stride), padding='same',
                                       activation=None, use_bias=False,
                                       kernel_initializer=filter_initializer)
            # Apply batch normalization before the non-linear activation
            outputs = tf.layers.batch_normalization(outputs, training=is_training)
            outputs = tf.nn.relu(outputs)
        return outputs
    # Depthwise separable convolution: a standard convolution factorized into
    # a depthwise convolution and a pointwise convolution
    def _depthwise_conv2d(inputs,
                          pointwise_conv_filters,
                          depthwise_conv_kernel_size,
                          stride,
                          scope=''):
        with tf.variable_scope(scope):
            with tf.variable_scope('depthwise_conv'):
                outputs = tf.contrib.layers.separable_conv2d(
                    inputs,
                    None,  # num_outputs=None: run only the depthwise stage, skip the pointwise stage
                    depthwise_conv_kernel_size,
                    depth_multiplier=1,
                    stride=(stride, stride),
                    padding='SAME',
                    activation_fn=None,
                    weights_initializer=filter_initializer,
                    biases_initializer=None)
                outputs = tf.layers.batch_normalization(outputs, training=is_training)
                outputs = tf.nn.relu(outputs)
            with tf.variable_scope('pointwise_conv'):
                pointwise_conv_filters = int(pointwise_conv_filters * alpha)
                outputs = tf.layers.conv2d(outputs,
                                           pointwise_conv_filters,
                                           (1,1),
                                           padding='same',
                                           activation=None,
                                           use_bias=False,
                                           kernel_initializer=filter_initializer)
                outputs = tf.layers.batch_normalization(outputs, training=is_training)
                outputs = tf.nn.relu(outputs)

        return outputs
    # Average pooling; the window covers the whole feature map (global average pooling)
    def _avg_pool2d(inputs, scope=''):
        inputs_shape = inputs.get_shape().as_list()
        assert len(inputs_shape) == 4

        pool_height = inputs_shape[1]
        pool_width  = inputs_shape[2]

        with tf.variable_scope(scope):
            outputs = tf.layers.average_pooling2d(inputs,
                                                  [pool_height, pool_width],
                                                  strides=(1, 1),
                                                  padding='valid')
        return outputs

   
    with tf.variable_scope('mobilenet_v1', 'mobilenet_v1', [inputs]):
        end_points = {}
        net = inputs

        net = _conv2d(net, 32, [3,3], stride=2, scope='block0')
        end_points['block0'] = net
        net = _depthwise_conv2d(net, 64, [3, 3], stride=1, scope='block1')
        end_points['block1'] = net

        net = _depthwise_conv2d(net, 128, [3, 3], stride=2, scope='block2')
        end_points['block2'] = net
        net = _depthwise_conv2d(net, 128, [3, 3], stride=1, scope='block3')
        end_points['block3'] = net

        net = _depthwise_conv2d(net, 256, [3, 3], stride=2, scope='block4')
        end_points['block4'] = net
        net = _depthwise_conv2d(net, 256, [3, 3], stride=1, scope='block5')
        end_points['block5'] = net

        net = _depthwise_conv2d(net, 512, [3, 3], stride=2, scope='block6')
        end_points['block6'] = net
        net = _depthwise_conv2d(net, 512, [3, 3], stride=1, scope='block7')
        end_points['block7'] = net
        net = _depthwise_conv2d(net, 512, [3, 3], stride=1, scope='block8')
        end_points['block8'] = net
        net = _depthwise_conv2d(net, 512, [3, 3], stride=1, scope='block9')
        end_points['block9'] = net
        net = _depthwise_conv2d(net, 512, [3, 3], stride=1, scope='block10')
        end_points['block10'] = net
        net = _depthwise_conv2d(net, 512, [3, 3], stride=1, scope='block11')
        end_points['block11'] = net

        net = _depthwise_conv2d(net, 1024, [3, 3], stride=2, scope='block12')
        end_points['block12'] = net
        net = _depthwise_conv2d(net, 1024, [3, 3], stride=1, scope='block13')
        end_points['block13'] = net

        output = _avg_pool2d(net, scope='output')
    return output, end_points
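To see what the network definition above produces without running TensorFlow, the sketch below traces the spatial size and channel count after each block. The filter counts and strides are read off the `_conv2d` and `_depthwise_conv2d` calls above; it assumes a square input and 'same' padding, where the output size is ceil(input / stride). The names `BLOCKS` and `trace_shapes` are illustrative and not part of the original code.

```python
import math

# (output filters before alpha, stride) for block0..block13, read off the code above.
BLOCKS = [(32, 2), (64, 1), (128, 2), (128, 1), (256, 2), (256, 1),
          (512, 2), (512, 1), (512, 1), (512, 1), (512, 1), (512, 1),
          (1024, 2), (1024, 1)]


def trace_shapes(input_size, alpha):
    """Return (name, height/width, channels) per block, assuming 'same' padding."""
    size = input_size
    shapes = []
    for index, (filters, stride) in enumerate(BLOCKS):
        size = math.ceil(size / stride)   # 'same' padding: ceil(in / stride)
        # block0 is built by _conv2d, which does not scale its filters by alpha
        # in the code as written; the other blocks use int(filters * alpha).
        channels = filters if index == 0 else int(filters * alpha)
        shapes.append((f"block{index}", size, channels))
    return shapes


for name, size, channels in trace_shapes(224, alpha=1.0):
    print(f"{name}: {size}x{size}x{channels}")
```

For a 224x224 input the last block ends at 7x7, so the final average pooling, whose window matches the feature map, reduces the output to 1x1 with `int(1024 * alpha)` channels.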