Learning PyTorch from Scratch (14): LeNet

Convolutional neural networks

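In earlier posts we handled a 28 x 28 image by flattening it into a one-dimensional vector of length 784 and feeding that into fully connected layers to train a classifier. This approach has two main problems:

  1. Pixels that are adjacent within the same column of the image can end up far apart in the flattened vector, so the patterns they form may be hard for the model to recognize.
  2. For large input images, fully connected layers easily make the model too big. Suppose the input is a color photo (3 channels) whose height and width are both 1000 pixels. Even if the fully connected layer still has only 256 outputs, its weight matrix has shape \(3,000,000\times 256\). With each parameter stored as a 4-byte float, that is 3,000,000 x 256 x 4 bytes = 3,072,000,000 bytes, roughly 3 GB of memory or GPU memory.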
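
The arithmetic in the second point can be checked with a few lines of plain Python. This is only a sketch of the weight count (the bias is ignored, and the variable names are illustrative, not from the original post):

```python
# Weight-only memory estimate for one fully connected layer (bias ignored).
height, width, channels = 1000, 1000, 3    # input image from the text
out_features = 256                         # layer outputs from the text
bytes_per_param = 4                        # one 32-bit float per weight

in_features = height * width * channels    # length of the flattened input
num_weights = in_features * out_features   # entries in the weight matrix
num_bytes = num_weights * bytes_per_param  # total storage for the weights

print(in_features)                         # 3000000
print(num_weights)                         # 768000000
print(f"{num_bytes / 1e9:.3f} GB")         # decimal gigabytes
print(f"{num_bytes / 2**30:.3f} GiB")      # binary gibibytes
```

The decimal figure comes out slightly above 3 GB and the binary one slightly below 3 GiB, which is why "roughly 3 GB" is the right way to state it.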

Convolution effectively mitigates both problems. For background on convolution, pooling, and related operations, see the pinned article http://www.javashuo.com/article/p-qmobqjlc-kn.html

LeNet

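LeNet is one of the earlier neural networks to be proposed. Its structure is shown in the figure below.
[Figure: LeNet network structure]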

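LeNet's structure is fairly simple: two repetitions of convolution, activation, and pooling, followed by three fully connected layers. The convolutional layers use 5 x 5 kernels, and the pooling layers use a 2 x 2 window with a stride of 2.
For our 28 x 28 input, the convolutional block produces an output of shape [batch,16,4,4], which has to be reshaped to [batch,16x4x4] before it goes into the fully connected layers. In other words, for every sample the convolutions extract 16x4x4=256 features. These features are used to recognize spatial patterns in the image, such as lines and parts of objects.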
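
The [batch,16,4,4] shape can be derived by hand. The sketch below assumes no padding and a convolution stride of 1 (the defaults of nn.Conv2d) and uses the usual output-size formula floor((n + 2p - k) / s) + 1; the helper name conv_out is made up for illustration:

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output height/width of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 28                      # input height and width
trace = [size]
# (kernel, stride) for: conv 5x5, pool 2x2/2, conv 5x5, pool 2x2/2
for kernel, stride in [(5, 1), (2, 2), (5, 1), (2, 2)]:
    size = conv_out(size, kernel, stride)
    trace.append(size)

print(trace)                   # [28, 24, 12, 8, 4]
features = 16 * size * size    # 16 output channels of the second convolution
print(features)                # 256
```

Each 5 x 5 convolution trims 4 pixels from the height and width, and each pooling step halves them, which is how 28 becomes 4.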

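The fully connected block contains three fully connected layers with 120, 84, and 10 outputs respectively, where 10 is the number of output classes.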
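
For a sense of scale, the number of learnable parameters can be tallied by hand. This is a rough sketch that assumes the convolutional layers map 1 to 6 and then 6 to 16 channels with 5 x 5 kernels, as in the code used in this post, and that every layer has a bias; the helper names are hypothetical:

```python
def conv_params(in_channels, out_channels, kernel):
    # one kernel x kernel filter per (input, output) channel pair, plus biases
    return in_channels * out_channels * kernel * kernel + out_channels

def linear_params(in_features, out_features):
    # dense weight matrix plus one bias per output
    return in_features * out_features + out_features

layers = {
    "conv1": conv_params(1, 6, 5),
    "conv2": conv_params(6, 16, 5),
    "fc1": linear_params(16 * 4 * 4, 120),
    "fc2": linear_params(120, 84),
    "fc3": linear_params(84, 10),
}
for name, count in layers.items():
    print(name, count)
total = sum(layers.values())
print("total", total)          # total 44426
```

Most of the parameters sit in the first fully connected layer, yet the whole network stays far below the size of the single dense layer discussed at the start of the post.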

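import torch
from torch import nn

# Convolutional block only: two rounds of convolution -> sigmoid -> max pooling
net0 = nn.Sequential(
    nn.Conv2d(1, 6, 5),  # in_channels, out_channels, kernel_size
    nn.Sigmoid(),
    nn.MaxPool2d(2, 2),  # kernel_size, stride
    nn.Conv2d(6, 16, 5),
    nn.Sigmoid(),
    nn.MaxPool2d(2, 2)
)
batch_size = 64
X = torch.randn((batch_size, 1, 28, 28))  # random batch shaped like 28 x 28 grayscale images
out = net0(X)
print(out.shape)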

Output:

torch.Size([64, 16, 4, 4])

This is where the earlier statement comes from: for our 28 x 28 input, the convolutional block produces an output of shape [batch,16,4,4].

Model definition

With that, we can give the definition of LeNet:

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 6, 5), # in_channels, out_channels, kernel_size
            nn.Sigmoid(),
            nn.MaxPool2d(2, 2), # kernel_size, stride
            nn.Conv2d(6, 16, 5),
            nn.Sigmoid(),
            nn.MaxPool2d(2, 2)
        )
        self.fc = nn.Sequential(
            nn.Linear(16*4*4, 120),
            nn.Sigmoid(),
            nn.Linear(120, 84),
            nn.Sigmoid(),
            nn.Linear(84, 10)
        )

    def forward(self, img):
        feature = self.conv(img)
        output = self.fc(feature.view(img.shape[0], -1))
        return output

In forward(), the features have to be reshaped with feature.view(img.shape[0], -1) before they are fed into the fully connected layers.
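
To make the reshape concrete, here is a plain-Python sketch of what flattening one sample's [16,4,4] feature block does. It assumes the tensor is stored contiguously in row-major order, so element (c, h, w) lands at position (c * 4 + h) * 4 + w of the 256-long vector; nested lists stand in for the tensor and the function names are illustrative:

```python
def flat_index(c, h, w, height=4, width=4):
    """Position of element (c, h, w) after row-major flattening."""
    return (c * height + h) * width + w

def flatten(feature):
    """Flatten a [channels][height][width] nested list into one list."""
    return [value for channel in feature for row in channel for value in row]

C, H, W = 16, 4, 4
# Fill each cell with its own expected flat position so the order is visible.
feature = [[[flat_index(c, h, w) for w in range(W)]
            for h in range(H)]
           for c in range(C)]

flat = flatten(feature)
print(len(flat))                         # 256
print(flat == list(range(C * H * W)))    # True
```

The batch dimension is untouched: view(img.shape[0], -1) keeps one row per sample and lets -1 resolve to the 256 features.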

We train on the GPU, so the network's parameters have to be stored in GPU memory:

net = LeNet().cuda()

Data loading

import torch
from torch import nn
import sys
sys.path.append("..") 
import learntorch_utils

batch_size,num_workers=64,4
train_iter,test_iter = learntorch_utils.load_data(batch_size,num_workers)

load_data is defined in learntorch_utils.py as follows:

import torch
import torchvision
from torchvision import transforms

def load_data(batch_size,num_workers):
    mnist_train = torchvision.datasets.FashionMNIST(root='/home/sc/disk/keepgoing/learn_pytorch/Datasets/FashionMNIST',
                                                    train=True, download=True,
                                                    transform=transforms.ToTensor())
    mnist_test = torchvision.datasets.FashionMNIST(root='/home/sc/disk/keepgoing/learn_pytorch/Datasets/FashionMNIST',
                                                train=False, download=True,
                                                transform=transforms.ToTensor())

    train_iter = torch.utils.data.DataLoader(
        mnist_train, batch_size=batch_size, shuffle=True, num_workers=num_workers)
    test_iter = torch.utils.data.DataLoader(
        mnist_test, batch_size=batch_size, shuffle=False, num_workers=num_workers)
    
    return train_iter,test_iter

Define the loss function

l = nn.CrossEntropyLoss()
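
As a reminder of what this loss measures, the sketch below computes softmax cross-entropy by hand in plain Python. It is an illustration of the formula, not PyTorch's implementation: for one sample the loss is the negative log of the softmax probability assigned to the true class, and with the default settings nn.CrossEntropyLoss averages that over the batch. The function names are made up for this example:

```python
import math

def cross_entropy(logits, target):
    """-log(softmax(logits)[target]) for a single sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]

def batch_cross_entropy(batch_logits, targets):
    """Mean of the per-sample losses over a batch."""
    losses = [cross_entropy(z, t) for z, t in zip(batch_logits, targets)]
    return sum(losses) / len(losses)

uniform = [0.0] * 10                  # an untrained model with no preference
print(cross_entropy(uniform, 3))      # ln(10), about 2.3026
print(batch_cross_entropy([[5.0, 0.0], [0.0, 5.0]], [0, 1]))
```

A 10-class model that guesses uniformly scores ln(10), which is a handy baseline when reading training losses.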

Define the optimizer

opt = torch.optim.Adam(net.parameters(),lr=0.01)

Define the evaluation function

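def test():
    acc_sum = 0
    batch = 0
    with torch.no_grad():  # evaluation only, so no gradients are needed
        for X,y in test_iter:
            X,y = X.cuda(),y.cuda()  # move the batch to GPU memory
            y_hat = net(X)
            # count predictions whose highest-scoring class matches the label
            acc_sum += (y_hat.argmax(dim=1) == y).float().sum().item()
            batch += 1
    # batch*batch_size equals the sample count only when every batch is full;
    # a smaller final batch makes this slightly underestimate the accuracy
    print('acc:%f' % (acc_sum/(batch*batch_size)))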
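
The denominator batch*batch_size deserves a second look. The sketch below assumes the FashionMNIST test set holds 10,000 images and that the DataLoader keeps the final partial batch (its default behavior); the correct count is a made-up number used only to show the effect:

```python
import math

def padded_count(num_samples, batch_size):
    """Number of batches, and the sample count implied by batch * batch_size."""
    num_batches = math.ceil(num_samples / batch_size)
    return num_batches, num_batches * batch_size

num_samples = 10_000     # assumed size of the FashionMNIST test set
batch_size = 64
num_batches, padded = padded_count(num_samples, batch_size)
print(num_batches, padded)      # 157 10048

correct = 8000                  # hypothetical number of correct predictions
print(correct / padded)         # what dividing by batch * batch_size reports
print(correct / num_samples)    # accuracy over the real sample count
```

Accumulating y.shape[0] inside the loop and dividing by that total would give the exact accuracy regardless of the batch size.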

Training

  • Forward pass
  • Compute the loss
  • Zero the gradients, then backpropagate
  • Update the parameters

num_epochs=5
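def train():
    for epoch in range(num_epochs):
        train_l_sum,batch=0,0
        for X,y in train_iter:
            X,y = X.cuda(),y.cuda() # move the tensors to GPU memory
            y_hat = net(X)  # forward pass
            loss = l(y_hat,y) # compute the loss; nn.CrossEntropyLoss applies softmax internally
            opt.zero_grad() # clear the old gradients
            loss.backward() # backpropagate to compute the gradients
            opt.step() # update the parameters from the gradients

            train_l_sum += loss.item() # loss.item() is already the mean over the batch
            batch += 1
        # dividing by batch*batch_size therefore prints the mean batch loss scaled by 1/batch_size
        print('epoch %d,train_loss %f' % (epoch + 1,train_l_sum/(batch*batch_size)))
        test()

train()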

The output is as follows:

epoch 1,train_loss 0.011750
acc:0.799064
epoch 2,train_loss 0.006442
acc:0.855195
epoch 3,train_loss 0.005401
acc:0.857584
epoch 4,train_loss 0.004946
acc:0.874602
epoch 5,train_loss 0.004631
acc:0.874403
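
One way to read the train_loss column: the training loop sums loss.item(), which with the default nn.CrossEntropyLoss settings is already a per-batch mean, and then divides by batch*batch_size. Under that assumption, multiplying each printed value by batch_size recovers the average batch loss. A small sketch using the numbers printed above:

```python
batch_size = 64
# train_loss values copied from the output above
reported = [0.011750, 0.006442, 0.005401, 0.004946, 0.004631]

# Undo the extra division by batch_size to get the mean loss per batch.
mean_batch_loss = [value * batch_size for value in reported]

for epoch, loss in enumerate(mean_batch_loss, start=1):
    print(f"epoch {epoch}: mean batch loss ~ {loss:.3f}")
```

On this reading the loss falls from about 0.75 in the first epoch to about 0.30 in the fifth, compared with roughly 2.30 for a model that guesses uniformly among 10 classes.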