Since AlexNet appeared in 2012, convolutional neural networks have been widely applied to image classification, object detection, image segmentation, and related tasks, and many stronger models have followed, such as the VGG and GoogLeNet families, ResNet, and DenseNet.
As accuracy improved, model depth grew accordingly: from the 8 layers of AlexNet to the 16-layer VGG, the 22-layer GoogLeNet, the 152-layer ResNet, and even ResNet and DenseNet variants with over a thousand layers. This brought efficiency problems, so lightweight convolutional architectures that preserve a reasonable level of accuracy were proposed, and SqueezeNet is one of them.
At the same accuracy level, a lightweight CNN architecture offers the following advantages:
1. In distributed training, less communication with the server is required
2. Fewer parameters, so downloading the model from the cloud transfers less data
3. Better suited to deployment on FPGAs and embedded hardware.
SqueezeNet achieves the same accuracy as AlexNet on ImageNet while using only 1/50 of the parameters. Going further, with model-compression techniques SqueezeNet can be compressed to 0.5MB, which is 1/510 the size of AlexNet.
The main contributions of SqueezeNet are:
1. A new building block, the Fire module, which compresses the model by reducing the number of parameters
2. Further compression of the proposed SqueezeNet model using additional techniques
3. An exploration of the design space, mainly the effect of the squeeze ratio and the proportion of 3×3 filters
SqueezeNet uses the Fire module as its basic building block; its structure is shown in the figure below:
A Fire module consists of two layers: a squeeze layer followed by an expand layer. As shown above, the squeeze layer is a convolution with 1*1 kernels, and the expand layer uses a mix of 1*1 and 3*3 kernels; the feature maps produced by the 1*1 and 3*3 branches are then concatenated.
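The concatenation at the end of the expand layer simply stacks the two branches' feature maps along the channel dimension. A minimal sketch with made-up branch outputs (the shapes are illustrative, not from the paper):

```python
import torch

# Hypothetical expand-branch outputs for one Fire module on a 56*56 feature map:
out1x1 = torch.randn(1, 64, 56, 56)  # from the 1*1 expand branch
out3x3 = torch.randn(1, 64, 56, 56)  # from the 3*3 expand branch (padding=1 keeps the size)

# Concatenate along the channel axis (dim=1 in NCHW layout)
merged = torch.cat([out1x1, out3x3], dim=1)
print(merged.shape)  # torch.Size([1, 128, 56, 56])
```

Because both branches see the same spatial size, the concat only changes the channel count (64 + 64 = 128).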
具體操做以下圖所示:
s1 是 Squeeze層 1*1卷積核的數量, e1 是Expand層1*1卷積核的數量, e3是Expand層3*3卷積核的數量,在文中提出的 SqueezeNet 結構中,e1=e3=4s1。
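To see why the squeeze layer saves parameters, here is a worked count for fire2's sizes from the paper (input 96 channels, s1=16, e1=e3=64), compared against a plain 3*3 convolution with the same 96-in / 128-out channels:

```python
def conv_params(k, c_in, c_out):
    """Weights plus biases of a k*k convolution."""
    return k * k * c_in * c_out + c_out

squeeze = conv_params(1, 96, 16)      # 1*1 squeeze: 1552
expand1x1 = conv_params(1, 16, 64)    # 1*1 expand branch: 1088
expand3x3 = conv_params(3, 16, 64)    # 3*3 expand branch: 9280
fire_total = squeeze + expand1x1 + expand3x3

plain3x3 = conv_params(3, 96, 128)    # a plain 3*3 conv, no squeeze

print(fire_total, plain3x3)  # 11920 vs 110720, roughly 9x fewer parameters
```

The saving comes from two of the paper's strategies at once: replacing most 3*3 kernels with 1*1 kernels, and squeezing the channel count before the expensive 3*3 convolutions.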
The overall SqueezeNet architecture is shown in the figure below:
SqueezeNet begins with a convolutional layer (conv1), followed by 8 Fire modules (fire2-9), and ends with a final convolutional layer (conv10). The number of filters per Fire module increases gradually, and max pooling with stride 2 is applied after conv1, fire4, fire8, and conv10.
The detailed layer parameters are shown in the figure below:
The comparison with AlexNet on the ImageNet dataset is shown below:
As the table shows, with no model compression applied to either network, the model is 50x smaller than AlexNet while matching its accuracy.
A PyTorch implementation (the layer layout matches torchvision's squeezenet1_0, so the official pretrained weights load directly):

import torch
import torch.nn as nn
from torchvision.models import squeezenet1_0
from torchvision import transforms
from PIL import Image


class Fire(nn.Module):
    def __init__(self, in_channels, squeeze_channels, expand1x1_channels, expand3x3_channels):
        super(Fire, self).__init__()
        # squeeze layer: 1*1 convolutions
        self.squeeze = nn.Conv2d(in_channels, squeeze_channels, kernel_size=1)
        self.squeeze_activation = nn.ReLU(inplace=True)
        # expand layer: parallel 1*1 and 3*3 convolutions
        self.expand1x1 = nn.Conv2d(squeeze_channels, expand1x1_channels, kernel_size=1)
        self.expand1x1_activation = nn.ReLU(inplace=True)
        self.expand3x3 = nn.Conv2d(squeeze_channels, expand3x3_channels, kernel_size=3, padding=1)
        self.expand3x3_activation = nn.ReLU(inplace=True)

    def forward(self, X):
        X = self.squeeze_activation(self.squeeze(X))
        # concatenate the two expand branches along the channel dimension
        X = torch.cat([
            self.expand1x1_activation(self.expand1x1(X)),
            self.expand3x3_activation(self.expand3x3(X))
        ], dim=1)
        return X


class SqueezeNet(nn.Module):
    def __init__(self):
        super(SqueezeNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels=3, out_channels=96, kernel_size=7, stride=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),
            Fire(96, 16, 64, 64),
            Fire(128, 16, 64, 64),
            Fire(128, 32, 128, 128),
            nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),
            Fire(256, 32, 128, 128),
            Fire(256, 48, 192, 192),
            Fire(384, 48, 192, 192),
            Fire(384, 64, 256, 256),
            nn.MaxPool2d(kernel_size=3, stride=2, ceil_mode=True),
            Fire(512, 64, 256, 256)
        )
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.5),
            nn.Conv2d(512, 1000, kernel_size=1),   # output: 13*13*1000
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d((1, 1))           # output: 1*1*1000
        )

    def forward(self, X):
        X = self.features(X)
        X = self.classifier(X)
        return torch.flatten(X, 1)


# Image preprocessing (resize to 224, convert to tensor, normalize)
tran = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

if __name__ == '__main__':
    image = Image.open("tiger.jpeg")
    image = tran(image)
    image = torch.unsqueeze(image, dim=0)

    net = SqueezeNet()
    # net = squeezenet1_0()
    for name, parameter in net.named_parameters():
        print("name={}, size={}".format(name, parameter.size()))

    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    net = net.to(device)
    image = image.to(device)
    # load the pretrained torchvision weights (the key names match the class above)
    net.load_state_dict(torch.load("squeezenet1_0-a815701f.pth", map_location=device))
    net.eval()

    output = net(image)
    _, pred = torch.max(output, 1)
    synset = [l.strip() for l in open("synset.txt").readlines()]
    print("top1:", synset[pred.item()])

    pred_index = torch.argsort(output, dim=1, descending=True)[0]
    top5 = [(synset[pred_index[i]], output[0][pred_index[i]].item()) for i in range(5)]
    print("Top5:", top5)
The same network in TensorFlow 1.x (note that conv10 here outputs 10 channels, i.e. a 10-class variant rather than the paper's 1000):

import tensorflow as tf
import numpy as np


class SqueezeNet():
    def __init__(self, parameter_path=None):
        if parameter_path:
            self.parameter_dict = np.load(parameter_path, encoding="latin1").item()
        else:
            self.parameter_dict = {}
        self.is_training = True

    def set_training(self, is_training):
        self.is_training = is_training

    def build(self, image):
        RGB_MEAN = [103.939, 116.779, 123.68]
        with tf.variable_scope("preprocess"):
            mean = tf.constant(value=RGB_MEAN, dtype=tf.float32, shape=[1, 1, 1, 3], name="preprocess_mean")
            image = image - mean

        self.conv1 = self._conv_layer(image, stride=2, filter_size=7, in_channels=3, out_channels=96, name="conv1")  # 112
        self.conv1_relu = tf.nn.relu(self.conv1)
        self.maxpool1 = self._max_pool(self.conv1_relu, filter_size=3, stride=2)  # 56

        self.Fire2 = self._Fire(self.maxpool1, 96, 16, 64, 64, name="Fire2_")
        self.Fire3 = self._Fire(self.Fire2, 128, 16, 64, 64, name="Fire3_")
        self.Fire4 = self._Fire(self.Fire3, 128, 32, 128, 128, name="Fire4_")
        self.maxpool2 = self._max_pool(self.Fire4, filter_size=3, stride=2, padding="VALID")  # 27

        self.Fire5 = self._Fire(self.maxpool2, 256, 32, 128, 128, name="Fire5_")
        self.Fire6 = self._Fire(self.Fire5, 256, 48, 192, 192, name="Fire6_")
        self.Fire7 = self._Fire(self.Fire6, 384, 48, 192, 192, name="Fire7_")
        self.Fire8 = self._Fire(self.Fire7, 384, 64, 256, 256, name="Fire8_")
        self.maxpool3 = self._max_pool(self.Fire8, filter_size=3, stride=2, padding="VALID")  # 13

        self.Fire9 = self._Fire(self.maxpool3, 512, 64, 256, 256, name="Fire9_")
        # self.droup = tf.nn.dropout(self.Fire9, keep_prob=0.5)

        self.conv10 = self._conv_layer(self.Fire9, stride=1, filter_size=1, in_channels=512, out_channels=10,
                                       name="conv10")
        print("self.conv10.get_shape()={}".format(self.conv10.get_shape()))
        self.avgpool = self._avg_pool(self.conv10, filter_size=13, stride=1)
        print("self.avgpool.get_shape()={}".format(self.avgpool.get_shape()))
        return tf.squeeze(self.avgpool, [1, 2])

    def _Fire(self, input, in_channels, squeeze_channels, expand1x1_channels, expand3x3_channels, name):
        self.squeeze_conv = self._conv_layer(input, stride=1, filter_size=1,
                                             in_channels=in_channels, out_channels=squeeze_channels,
                                             name=name + "squeeze_conv")
        self.squeeze_conv_relu = tf.nn.relu(self.squeeze_conv)

        self.expand1x1_conv = self._conv_layer(self.squeeze_conv_relu, stride=1, filter_size=1,
                                               in_channels=squeeze_channels, out_channels=expand1x1_channels,
                                               name=name + "expand1x1_conv")
        self.expand1x1_conv_relu = tf.nn.relu(self.expand1x1_conv)

        self.expand3x3_conv = self._conv_layer(self.squeeze_conv_relu, stride=1, filter_size=3,
                                               in_channels=squeeze_channels, out_channels=expand3x3_channels,
                                               name=name + "expand3x3_conv")
        self.expand3x3_conv_relu = tf.nn.relu(self.expand3x3_conv)

        return tf.concat([self.expand1x1_conv_relu, self.expand3x3_conv_relu], axis=3)

    def _batch_norm(self, input):
        return tf.layers.batch_normalization(inputs=input, axis=3, momentum=0.99,
                                             epsilon=1e-12, center=True, scale=True,
                                             training=self.is_training)

    def _avg_pool(self, input, filter_size, stride, padding="VALID"):
        return tf.nn.avg_pool(input, ksize=[1, filter_size, filter_size, 1],
                              strides=[1, stride, stride, 1], padding=padding)

    def _max_pool(self, input, filter_size, stride, padding="SAME"):
        return tf.nn.max_pool(input, ksize=[1, filter_size, filter_size, 1],
                              strides=[1, stride, stride, 1], padding=padding)

    def _conv_layer(self, input, stride, filter_size, in_channels, out_channels, name, padding="SAME"):
        '''Convolutional layer.'''
        with tf.variable_scope(name):
            conv_filter, bias = self._get_conv_parameter(filter_size, in_channels, out_channels, name)
            conv = tf.nn.conv2d(input, filter=conv_filter, strides=[1, stride, stride, 1], padding=padding)
            conv_bias = tf.nn.bias_add(conv, bias)
            return conv_bias

    def _fc_layer(self, input, in_size, out_size, name):
        '''Fully connected layer.'''
        with tf.variable_scope(name):
            input = tf.reshape(input, [-1, in_size])
            fc_weights, fc_bias = self._get_fc_parameter(in_size, out_size, name)
            fc = tf.nn.bias_add(tf.matmul(input, fc_weights), fc_bias)
            return fc

    def _get_conv_parameter(self, filter_size, in_channels, out_channels, name):
        '''Fetch the parameters of a convolutional layer.
        :param filter_size: kernel size
        :param in_channels: input channel count
        :param out_channels: output channel count, i.e. the number of kernels
        :param name: name of the current conv layer
        :return: the kernel weights and the bias
        '''
        if name in self.parameter_dict:
            conv_filter_initValue = self.parameter_dict[name][0]
            bias_initValue = self.parameter_dict[name][1]
            conv_filter_value = tf.Variable(initial_value=conv_filter_initValue, name=name + "_weights")
            bias = tf.Variable(initial_value=bias_initValue, name=name + "_biases")
        else:
            conv_filter_value = tf.get_variable(name=name + "_weights",
                                                shape=[filter_size, filter_size, in_channels, out_channels],
                                                initializer=tf.contrib.keras.initializers.he_normal())
            bias = tf.get_variable(name=name + "_biases", shape=[out_channels],
                                   initializer=tf.constant_initializer(0.1, dtype=tf.float32))
        return conv_filter_value, bias

    def _get_fc_parameter(self, in_size, out_size, name):
        '''Fetch the parameters of a fully connected layer.
        :param in_size: input size
        :param out_size: output size
        :param name: layer name
        :return: the weights and the bias
        '''
        if name in self.parameter_dict:
            fc_weights_initValue = self.parameter_dict[name][0]
            fc_bias_initValue = self.parameter_dict[name][1]
            fc_weights = tf.Variable(initial_value=fc_weights_initValue, name=name + "_weights")
            fc_bias = tf.Variable(initial_value=fc_bias_initValue, name=name + "_biases")
        else:
            fc_weights = tf.get_variable(name=name + "_weights",
                                         shape=[in_size, out_size],
                                         initializer=tf.contrib.keras.initializers.he_normal())
            fc_bias = tf.get_variable(name=name + "_biases", shape=[out_size],
                                      initializer=tf.constant_initializer(0.1, dtype=tf.float32))
        return fc_weights, fc_bias


if __name__ == '__main__':
    input = tf.placeholder(dtype=tf.float32, shape=[1, 224, 224, 3], name="input")
    net = SqueezeNet()
    out_put = net.build(input)
    print(out_put.get_shape())
The complete code and the pretrained PyTorch model weights are available via Baidu netdisk; follow the WeChat public account AI計算機視覺工坊 and reply 【代碼】 to get them.