[Convolutional Network Model Series] Introduction to and Implementation of the Lightweight Convolutional Network SqueezeNet (PyTorch, TensorFlow)

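1. Background

Since AlexNet appeared in 2012, convolutional neural networks have been widely applied to image classification, object detection, image segmentation, and related tasks. Researchers have since proposed many stronger models, such as VGG, the GoogLeNet series, ResNet, and DenseNet.

As accuracy improved, model depth grew with it: from the 8 layers of AlexNet to the 16 layers of VGG, the 22 layers of GoogLeNet, the 152 layers of ResNet, and even ResNet and DenseNet variants with more than a thousand layers. This depth brings efficiency problems. Lightweight convolutional architectures that keep accuracy at a given level were therefore proposed, and SqueezeNet is one of them.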


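For the same level of accuracy, a lightweight CNN architecture offers the following advantages:

1. Less communication with the server is required during distributed training.

2. There are fewer parameters, so less data has to be transferred when downloading the model from the cloud.

3. It is better suited to deployment on FPGAs and embedded hardware.

SqueezeNet achieves the same accuracy as AlexNet on ImageNet while using only 1/50 of the parameters. Going further, with model compression techniques SqueezeNet can be compressed to 0.5 MB, which is 1/510 the size of AlexNet.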


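2. SqueezeNet Overview

The main contributions of SqueezeNet are:

1. A new network building block, the Fire module, which compresses the model by reducing the number of parameters.

2. Further compression of the proposed SqueezeNet model using other methods.

3. An exploration of the design space, mainly studying the effect of the squeeze ratio and of the proportion of 3×3 convolutions.

SqueezeNet uses the Fire module as the basic building block of the network. Its structure is shown in the figure below:

A Fire module consists of two layers: a squeeze layer followed by an expand layer, as shown in the figure above. The squeeze layer is a convolutional layer with 1×1 kernels. The expand layer is a convolutional layer with both 1×1 and 3×3 kernels, and the feature maps produced by its 1×1 and 3×3 convolutions are concatenated along the channel dimension.

The operation is illustrated in the figure below:

s1 is the number of 1×1 kernels in the squeeze layer, e1 is the number of 1×1 kernels in the expand layer, and e3 is the number of 3×3 kernels in the expand layer. In the SqueezeNet architecture proposed in the paper, e1 = e3 = 4·s1.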
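To make the parameter saving concrete, here is a minimal sketch that counts the weights and biases of one Fire module and compares it with a single 3×3 convolution producing the same number of output channels. The helper names `conv_params` and `fire_params` are made up for this illustration, and the 96 input channels are just an example value.

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus biases of a k x k convolution."""
    return in_ch * out_ch * k * k + out_ch


def fire_params(in_ch, s1, e1, e3):
    """Parameters of one Fire module: squeeze 1x1, expand 1x1, expand 3x3."""
    squeeze = conv_params(in_ch, s1, 1)
    expand1x1 = conv_params(s1, e1, 1)
    expand3x3 = conv_params(s1, e3, 3)
    return squeeze + expand1x1 + expand3x3


# Example sizes (assumed for illustration): 96 input channels, s1 = 16, e1 = e3 = 4 * s1
s1 = 16
fire = fire_params(96, s1, 4 * s1, 4 * s1)
plain = conv_params(96, 8 * s1, 3)  # one 3x3 conv with the same 128 output channels
print(fire, plain, round(plain / fire, 2))  # 11920 110720 9.29
```

Most of the saving comes from the squeeze layer: the 3×3 kernels only see s1 input channels instead of all 96.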

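The overall structure of the SqueezeNet network is shown in the figure below:

SqueezeNet begins with a convolutional layer (conv1), followed by 8 Fire modules (fire2-9), and ends with a convolutional layer (conv10). The number of filters per Fire module increases gradually. Max pooling with stride 2 is applied after conv1, fire4, and fire8, and global average pooling is applied after conv10.

The detailed model parameters are shown in the figure below:

The comparison with AlexNet on the ImageNet dataset is shown in the figure below:

As can be seen, with neither model compressed, SqueezeNet is 50 times smaller than AlexNet while matching its accuracy.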
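As a rough cross-check of the model size, the parameters can be summed by hand. The sketch below assumes the layer sizes used in the PyTorch listing in section 3 (a 7×7 conv1 with 96 filters, eight Fire modules, and a 1×1 conv10 with 1000 outputs) and 4 bytes per parameter (float32). The helper names are hypothetical, and a saved checkpoint file will differ slightly in size.

```python
def conv_params(in_ch, out_ch, k):
    """Weights plus biases of a k x k convolution."""
    return in_ch * out_ch * k * k + out_ch


def fire_params(in_ch, s1, e1, e3):
    """Parameters of one Fire module (squeeze 1x1 + expand 1x1 + expand 3x3)."""
    return conv_params(in_ch, s1, 1) + conv_params(s1, e1, 1) + conv_params(s1, e3, 3)


# (in_channels, s1, e1, e3) for fire2 .. fire9
FIRE_CONFIGS = [
    (96, 16, 64, 64), (128, 16, 64, 64), (128, 32, 128, 128), (256, 32, 128, 128),
    (256, 48, 192, 192), (384, 48, 192, 192), (384, 64, 256, 256), (512, 64, 256, 256),
]

total = conv_params(3, 96, 7)                            # conv1
total += sum(fire_params(*cfg) for cfg in FIRE_CONFIGS)  # fire2 .. fire9
total += conv_params(512, 1000, 1)                       # conv10
size_mb = total * 4 / 1e6                                # float32, assumed
print(total, round(size_mb, 2))  # 1248424 4.99
```

Note that conv10 alone accounts for 513,000 of these parameters, so the eight Fire modules together cost well under a million.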


3. SqueezeNet Implementation (PyTorch, TensorFlow)

1. PyTorch implementation

import torch
import torch.nn as nn
from torchvision.models import squeezenet1_0
from torchvision import transforms
from PIL import Image


class Fire(nn.Module):
    def __init__(self, in_channels, squeeze_channels, expand1x1_channels, expand3x3_channels):
        super(Fire, self).__init__()

        self.squeeze = nn.Conv2d(in_channels, squeeze_channels, kernel_size=1)
        self.squeeze_activation = nn.ReLU(inplace=True)

        self.expand1x1 = nn.Conv2d(squeeze_channels, expand1x1_channels, kernel_size=1)
        self.expand1x1_activation = nn.ReLU(inplace=True)

        self.expand3x3 = nn.Conv2d(squeeze_channels, expand3x3_channels, kernel_size=3, padding=1)
        self.expand3x3_activation = nn.ReLU(inplace=True)

    def forward(self, X):
        X = self.squeeze_activation(self.squeeze(X))
        X = torch.cat([
            self.expand1x1_activation(self.expand1x1(X)),
            self.expand3x3_activation(self.expand3x3(X))
        ], dim=1)

        return X



class SqueezeNet(nn.Module):
    def __init__(self):
        super(SqueezeNet, self).__init__()

        self.features = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=96, kernel_size=7, stride=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),
            Fire(96, 16, 64, 64),
            Fire(128, 16, 64, 64),
            Fire(128, 32, 128, 128),
            nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),
            Fire(256, 32, 128, 128),
            Fire(256, 48, 192, 192),
            Fire(384, 48, 192, 192),
            Fire(384, 64, 256, 256),
            nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),
            Fire(512, 64, 256, 256)
        )

        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Conv2d(512, 1000, kernel_size=1),   # output: 13*13*1000
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((1, 1))  # output: 1*1*1000
        )

    def forward(self, X):
        X = self.features(X)
        # X is now (N, 512, 13, 13) for a 224x224 input
        X = self.classifier(X)
        return torch.flatten(X, 1)

# Image preprocessing (resize to 224x224, convert to a torch tensor, normalize)
tran = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
])

if __name__ == '__main__':
    image = Image.open("tiger.jpeg")
    image = tran(image)
    image = torch.unsqueeze(image, dim=0)

    net = SqueezeNet()
    # net = squeezenet1_0()
    for name, parameter in net.named_parameters():
        print("name={},size={}".format(name, parameter.size()))
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    net = net.to(device)
    image = image.to(device)
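    # Load the pretrained model parameters released with PyTorch
    net.load_state_dict(torch.load("squeezenet1_0-a815701f.pth", map_location=device))
    net.eval()

    with torch.no_grad():  # inference only, no gradients needed
        output = net(image)
    score, pred = torch.max(output, 1)  # best score and its class index
    with open("synset.txt") as f:
        synset = [line.strip() for line in f]
    print("top1:", synset[pred.item()])

    # Class indices sorted by descending score; keep the first five
    pred_index = torch.argsort(output, dim=1, descending=True)[0]
    top5 = [(synset[pred_index[i].item()], output[0][pred_index[i]].item()) for i in range(5)]
    print("Top5:", top5)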



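The 13*13 feature map noted in the classifier comments can be checked with a little arithmetic. The sketch below assumes a 224×224 input, the unpadded 7×7 stride-2 conv1, and the three 3×3 stride-2 max-pooling layers with ceil_mode=True from the listing above. The helper names are made up for this example, and Fire modules are skipped because they keep the spatial size.

```python
import math


def conv_out(n, kernel, stride, padding=0):
    """Output size of a convolution along one spatial dimension (floor rounding)."""
    return (n + 2 * padding - kernel) // stride + 1


def pool_out_ceil(n, kernel, stride):
    """Output size of unpadded max pooling with ceil_mode=True."""
    out = math.ceil((n - kernel) / stride) + 1
    if (out - 1) * stride >= n:  # the last window must start inside the input
        out -= 1
    return out


n = conv_out(224, kernel=7, stride=2)  # conv1
sizes = [n]
for _ in range(3):                     # max pooling after conv1, fire4, fire8
    n = pool_out_ceil(n, kernel=3, stride=2)
    sizes.append(n)
print(sizes)  # [109, 54, 27, 13]
```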

2. TensorFlow implementation

# Note: this listing uses the TensorFlow 1.x API (tf.placeholder, tf.variable_scope, tf.contrib)
import tensorflow as tf
import numpy as np

class SqueezeNet():
    def __init__(self, parameter_path=None):
        if parameter_path:
            # allow_pickle=True is required by recent NumPy versions to load a pickled dict
            self.parameter_dict = np.load(parameter_path, encoding="latin1", allow_pickle=True).item()
        else:
            self.parameter_dict = {}
        self.is_training = True

    def set_training(self, is_training):
        self.is_training = is_training

    def bulid(self, image):
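        # Note: these per-channel means are listed in B, G, R order (the usual Caffe convention),
        # so either feed BGR images or reverse the list for RGB input
        RGB_MEAN = [103.939, 116.779, 123.68]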
        with tf.variable_scope("preprocess"):
            mean = tf.constant(value=RGB_MEAN, dtype=tf.float32, shape=[1, 1, 1, 3], name="preprocess_mean")
            image = image - mean

        self.conv1 = self._conv_layer(image, stride=2, filter_size=7, in_channels=3, out_channels=96, name="conv1") #112
        self.conv1_relu = tf.nn.relu(self.conv1)
        self.maxpool1 = self._max_pool(self.conv1_relu, filter_size=3, stride=2)   #56

        self.Fire2 = self._Fire(self.maxpool1, 96, 16, 64, 64, name="Fire2_")
        self.Fire3 = self._Fire(self.Fire2, 128, 16, 64, 64, name="Fire3_")
        self.Fire4 = self._Fire(self.Fire3, 128, 32, 128, 128, name="Fire4_")

        self.maxpool2 = self._max_pool(self.Fire4, filter_size=3, stride=2, padding="VALID")  #27

        self.Fire5 = self._Fire(self.maxpool2, 256, 32, 128, 128, name="Fire5_")
        self.Fire6 = self._Fire(self.Fire5, 256, 48, 192, 192, name="Fire6_")
        self.Fire7 = self._Fire(self.Fire6, 384, 48, 192, 192, name="Fire7_")
        self.Fire8 = self._Fire(self.Fire7, 384, 64, 256, 256, name="Fire8_")

        self.maxpool3 = self._max_pool(self.Fire8, filter_size=3, stride=2, padding="VALID")  #13

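        # fire9 uses 64 squeeze channels, matching the PyTorch version
        self.Fire9 = self._Fire(self.maxpool3, 512, 64, 256, 256, name="Fire9_")
        # self.droup = tf.nn.dropout(self.Fire9, keep_prob=0.5)
        # out_channels is the number of classes: 10 here, 1000 for ImageNet
        self.conv10 = self._conv_layer(self.Fire9, stride=1, filter_size=1, in_channels=512, out_channels=10,
                                       name="conv10")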

        print("self.conv10.get_shape()={}".format(self.conv10.get_shape()))
        self.avgpool = self._avg_pool(self.conv10, filter_size=13, stride=1)
        print("self.avgpool.get_shape()={}".format(self.avgpool.get_shape()))
        return tf.squeeze(self.avgpool, [1, 2])




    def _Fire(self, input, in_channels, squeeze_channels, expand1x1_channels, expand3x3_channels, name):
        self.squeeze_conv = self._conv_layer(input, stride=1, filter_size=1,
                                             in_channels=in_channels, out_channels=squeeze_channels,
                                             name=name+"squeeze_conv")
        self.squeeze_conv_relu = tf.nn.relu(self.squeeze_conv)

        self.expand1x1_conv = self._conv_layer(self.squeeze_conv_relu, stride=1, filter_size=1,
                                               in_channels=squeeze_channels, out_channels=expand1x1_channels,
                                               name=name+"expand1x1_conv")
        self.expand1x1_conv_relu = tf.nn.relu(self.expand1x1_conv)

        self.expand3x3_conv = self._conv_layer(self.squeeze_conv_relu, stride=1, filter_size=3,
                                               in_channels=squeeze_channels, out_channels=expand3x3_channels,
                                               name=name + "expand3x3_conv")
        self.expand3x3_conv_relu = tf.nn.relu(self.expand3x3_conv)

        return tf.concat([self.expand1x1_conv_relu, self.expand3x3_conv_relu], axis=3)



    def _batch_norm(self, input):
        return tf.layers.batch_normalization(inputs=input, axis=3, momentum=0.99,
                                             epsilon=1e-12, center=True, scale=True,
                                             training=self.is_training)

    def _avg_pool(self, input, filter_size, stride, padding="VALID"):
        return tf.nn.avg_pool(input, ksize=[1, filter_size, filter_size, 1],
                              strides=[1, stride, stride, 1], padding=padding)

    def _max_pool(self, input, filter_size, stride, padding="SAME"):
        return tf.nn.max_pool(input, ksize=[1, filter_size, filter_size, 1],
                              strides=[1, stride, stride, 1], padding=padding)

    def _conv_layer(self, input, stride, filter_size, in_channels, out_channels, name, padding="SAME"):
        ''' Define a convolutional layer '''
        with tf.variable_scope(name):
            conv_filter, bias = self._get_conv_parameter(filter_size, in_channels, out_channels, name)
            conv = tf.nn.conv2d(input, filter=conv_filter, strides=[1, stride, stride, 1], padding=padding)
            conv_bias = tf.nn.bias_add(conv, bias)
            return conv_bias

    def _fc_layer(self, input, in_size, out_size, name):
        ''' Define a fully connected layer '''
        with tf.variable_scope(name):
            input = tf.reshape(input, [-1, in_size])
            fc_weights, fc_bais = self._get_fc_parameter(in_size, out_size, name)
            fc = tf.nn.bias_add(tf.matmul(input, fc_weights), fc_bais)
            return fc

    def _get_conv_parameter(self, filter_size, in_channels, out_channels, name):
        '''
        Get the parameters of a convolutional layer.
        :param filter_size: kernel size
        :param in_channels: number of input channels of the kernel
        :param out_channels: number of output channels, i.e. the number of kernels
        :param name: name of the current convolutional layer
        :return: the kernel and the bias
        '''
        if name in self.parameter_dict:
            conv_filter_initValue = self.parameter_dict[name][0]
            bias_initValue = self.parameter_dict[name][1]
            conv_filter_value = tf.Variable(initial_value=conv_filter_initValue, name=name + "_weights")
            bias = tf.Variable(initial_value=bias_initValue, name=name + "_biases")
        else:

            conv_filter_value = tf.get_variable(name=name+"_weights",
                                                shape=[filter_size, filter_size, in_channels, out_channels],
                                                initializer=tf.contrib.keras.initializers.he_normal())
            bias = tf.get_variable(name=name+"_biases", shape=[out_channels],
                                   initializer=tf.constant_initializer(0.1, dtype=tf.float32))


        return conv_filter_value, bias

    def _get_fc_parameter(self, in_size, out_size, name):
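        '''
        Get the parameters of a fully connected layer.
        :param in_size: input size
        :param out_size: output size
        :param name: name of the current layer
        :return: the weights and the bias
        '''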
        if name in self.parameter_dict:
            fc_weights_initValue = self.parameter_dict[name][0]
            fc_bias_initValue = self.parameter_dict[name][1]
            fc_weights = tf.Variable(initial_value=fc_weights_initValue, name=name + "_weights")
            fc_bias = tf.Variable(initial_value=fc_bias_initValue, name=name + "_biases")
        else:
            fc_weights = tf.get_variable(name=name + "_weights",
                                                shape=[in_size, out_size],
                                                initializer=tf.contrib.keras.initializers.he_normal())
            fc_bias = tf.get_variable(name=name + "_biases", shape=[out_size],
                                   initializer=tf.constant_initializer(0.1, dtype=tf.float32))

        return fc_weights, fc_bias

if __name__ == '__main__':
    input = tf.placeholder(dtype=tf.float32, shape=[1, 224, 224, 3], name="input")
    net = SqueezeNet()
    out_put = net.bulid(input)
    print(out_put.get_shape())
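The size comments in the TensorFlow listing (#112, #56, #27, #13) follow from how "SAME" and "VALID" padding determine the output size. The sketch below reproduces them for a 224×224 input. The helper names are made up for this example, and Fire modules are skipped because their stride-1 "SAME" convolutions keep the spatial size.

```python
import math


def same_out(n, stride):
    """Output size with padding="SAME": ceil(n / stride), independent of the kernel size."""
    return math.ceil(n / stride)


def valid_out(n, kernel, stride):
    """Output size with padding="VALID": floor((n - kernel) / stride) + 1."""
    return (n - kernel) // stride + 1


conv1 = same_out(224, 2)         # 7x7 conv, stride 2, SAME
pool1 = same_out(conv1, 2)       # 3x3 max pool, stride 2, SAME (the _max_pool default)
pool2 = valid_out(pool1, 3, 2)   # 3x3 max pool, stride 2, VALID
pool3 = valid_out(pool2, 3, 2)   # 3x3 max pool, stride 2, VALID
avg = valid_out(pool3, 13, 1)    # 13x13 average pool, VALID
print(conv1, pool1, pool2, pool3, avg)  # 112 56 27 13 1
```

This is why the final average pool uses filter_size=13: it reduces the 13×13 map to 1×1, so tf.squeeze can drop the two spatial axes.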

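The complete code and the trained PyTorch model parameters are available from Baidu Netdisk: follow my WeChat official account AI計算機視覺工坊 and reply 【代碼】 to get them. The account posts articles on machine learning, deep learning, computer vision, and related topics from time to time. You are welcome to learn and discuss together.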