Learning Deep Learning with Theano (Part 3): Convolutional Neural Networks

Some rambling before the main content:

My TOEFL score is out: I nervously pulled off a 97! Way better than I expected! Great news all around! Confetti!

Silly me... I still have to grind for a month over winter break to push for a 100, so what exactly am I celebrating...

Main topic:

The topic is the CNN, but I'll explain the actual theory behind CNNs in detail in a later post under the deeplearning category.

Briefly, what makes a CNN different from an ordinary NN is that it replaces full connectivity with local (partial) connections, and uses pooling to reduce the dimensionality of the data. This has several benefits:

    1. For large images, the number of parameters to train drops dramatically
    2. The network captures local features of the image rather than only global ones
    3. Pooling gives the network's output a degree of invariance to translation and occlusion (see the numpy sketch after this list)
    4. For a demo see: (the results are quite good; Wall Street banks used it back in the day to read checks)
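To see the invariance point concretely, here is a tiny numpy sketch (the 2x2 max-pool helper and the toy images are my own illustration, not part of the tutorial code): shifting a feature by one pixel keeps it inside the same pooling window, so the pooled output does not change.

import numpy

def max_pool_2x2(img):
    # split the image into 2x2 blocks and take the max of each block
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = numpy.zeros((4, 4)); a[0, 0] = 1.0   # a feature at (0, 0)
b = numpy.zeros((4, 4)); b[1, 1] = 1.0   # the same feature shifted to (1, 1)
print(max_pool_2x2(a))                    # both print [[1. 0.] [0. 0.]]
print(max_pool_2x2(b))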

Here I'll mainly talk about the code.

1. Class: LeNetConvPoolLayer

    • One convolution plus one pooling operation, two layers in total.
    • Initialization arguments: the input data, the input image size, the filter size, and the pooling size.
    • Pooling outputs the maximum of each window, not the average.
    • The internal parameters are the filter bank W and the bias b; the overall output = tanh(pooled output + bias).
    • W and b are merged into a single list, params.
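For orientation, a minimal usage sketch; it assumes the LeNetConvPoolLayer class from the listing further down, with the same MNIST shapes used there:

import numpy
import theano.tensor as T

rng = numpy.random.RandomState(1234)
x = T.tensor4('x')                         # (batch, channels, height, width)
layer = LeNetConvPoolLayer(rng, input=x,
                           image_shape=(500, 1, 28, 28),
                           filter_shape=(20, 1, 5, 5),
                           poolsize=(2, 2))
# conv shrinks 28 to 28-5+1 = 24; 2x2 max pooling halves it to 12,
# so layer.output is a (500, 20, 12, 12) tensor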

2. Function: evaluate_lenet5

    • Contains two LeNetConvPoolLayers (layer0, layer1) plus two fully-connected layers (layer2, layer3).
    • The first fully-connected layer uses the HiddenLayer class; the second uses the LogisticRegression class (both from the MLP tutorial code, to be covered later).
    • test_model and validate_model: given a minibatch index, they return the error rate against the labels.
    • The parameters of all four layers are merged into one list: params = layer3.params + layer2.params + layer1.params + layer0.params (you can do that? I'd never seen it; see the sketch after this list), and grads = T.grad(cost, params) then computes all the partial derivatives at once. Very convenient.
    • train_model updates the parameters through the updates mechanism (faster; the updates list is built with a for loop).
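About that params trick: each layer's params attribute is just a Python list of shared variables, so + genuinely concatenates lists. A minimal sketch of the whole pattern, with a made-up toy cost standing in for the real network:

import numpy
import theano
import theano.tensor as T

# two toy "layers", each exposing a params list of shared variables
W0 = theano.shared(numpy.zeros((3, 3)), name='W0')
b0 = theano.shared(numpy.zeros(3), name='b0')
W1 = theano.shared(numpy.ones(3), name='W1')
layer0_params = [W0, b0]
layer1_params = [W1]

# '+' on Python lists concatenates, so this is one flat list of parameters
params = layer1_params + layer0_params

x = T.vector('x')
cost = T.sum(T.dot(T.dot(x, W0) + b0, W1) ** 2)   # toy cost, not LeNet's

# T.grad over a list returns one gradient per parameter
grads = T.grad(cost, params)

# build the SGD update list: (shared_variable, new_value) pairs
learning_rate = 0.1
updates = [(p, p - learning_rate * g) for p, g in zip(params, grads)]
train = theano.function([x], cost, updates=updates)

Calling train(numpy.ones(3)) would then take one SGD step on all three shared variables at once.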

That's roughly what these two building blocks look like.

Key points in the training process:

  • Two nested loops: the inner one runs over minibatches, indexed by minibatch_index; the outer one runs over passes through the full training set, indexed by epoch; iter counts the total number of training steps so far
  • patience caps the number of iterations; it starts at 10000 and is extended to twice the current iteration count whenever validation shows a big enough improvement
  • Every validation_frequency iterations the model is evaluated; if the current loss is below 0.995 × the best loss so far, patience is extended (see the sketch after this list)
  • Training stops once iter >= patience or epoch >= n_epochs
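To make the patience mechanics concrete, here is a stripped-down replica of the control flow; the fake validation loss is invented so the snippet runs on its own, and the names mirror the real code further down:

patience = 10000
patience_increase = 2
improvement_threshold = 0.995
n_epochs, n_train_batches = 200, 100
best_validation_loss = float('inf')

epoch, done_looping = 0, False
while epoch < n_epochs and not done_looping:
    epoch += 1
    for minibatch_index in range(n_train_batches):
        iter = (epoch - 1) * n_train_batches + minibatch_index
        this_loss = 1.0 / (iter + 1)      # stand-in for validate_model's output
        if this_loss < best_validation_loss * improvement_threshold:
            # significant improvement: extend the iteration budget
            patience = max(patience, iter * patience_increase)
        best_validation_loss = min(best_validation_loss, this_loss)
        if patience <= iter:              # budget used up: stop
            done_looping = True
            break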

That's roughly the training process.

A few thoughts:

  • Working through this code gave me a deeper understanding of Python classes.
  • As I understand it now, instantiating a class runs its initialization and sets up the internal parameters;
  • after that, calling the class's methods operates on that stored state, updating the parameters and handing them to the function
  • So a class is better than a function in C (C can only compute, while in Python the class builds the structure and keeps the parameters around)
  • I still don't fully understand shape and dimshuffle in theano.tensor (see the sketch after this list)
  • Also, this code uses for loops all over the place, while in MATLAB for loops are taboo. Why are they so common here, with matrix operations comparatively rare?
  • validation_losses = [validate_model(i) for i in xrange(n_valid_batches)] is a slick bit of usage (a list comprehension)

  • params = layer3.params + layer2.params + layer1.params + layer0.params: so this means concatenating the lists? (It does; + concatenates Python lists.)
  • Updating the parameters through updates: fast, precise, ruthless!
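On the dimshuffle question: dimshuffle('x', 0, 'x', 'x') pads a length-n vector out to shape (1, n, 1, 1) with broadcastable singleton axes, which is exactly what the bias addition in LeNetConvPoolLayer needs. A quick check (the all-zeros pooled output is just a stand-in):

import numpy
import theano
import theano.tensor as T

b = theano.shared(numpy.arange(3.0))   # a bias vector, shape (3,)
b4 = b.dimshuffle('x', 0, 'x', 'x')    # now shape (1, 3, 1, 1)
pooled = T.zeros((2, 3, 4, 4))         # stand-in for a pooled feature map
out = pooled + b4                      # bias broadcasts over batch, height, width
print(b4.eval().shape)                 # (1, 3, 1, 1)
print(out.eval().shape)                # (2, 3, 4, 4)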

 


 

Below is the code, which I read line by line and annotated as I went. (cnblog's column is too narrow; copy it into a text editor to read it, Sublime recommended.)

This implementation simplifies the model in the following ways:

 - LeNetConvPool doesn't implement location-specific gain and bias parameters
 - LeNetConvPool doesn't implement pooling by average, it implements pooling
   by max.
 - Digit classification is implemented with a logistic regression rather than
   an RBF network
 - LeNet5 was not fully-connected convolutions at second layer

"""
import cPickle
import gzip
import os
import sys
import time

import numpy

import theano
import theano.tensor as T
from theano.tensor.signal import downsample
from theano.tensor.nnet import conv

from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer


class LeNetConvPoolLayer(object):
    """Pool Layer of a convolutional network """

    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        Allocate a LeNetConvPoolLayer with shared variable internal parameters.

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dtensor4
        :param input: symbolic image tensor, of shape image_shape

        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height,filter width)

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type poolsize: tuple or list of length 2
        :param poolsize: the downsampling (pooling) factor (#rows,#cols)
        """

        assert image_shape[1] == filter_shape[1]
        self.input = input

        # there are "num input feature maps * filter height * filter width"
        # inputs to each hidden unit
        fan_in = numpy.prod(filter_shape[1:])
        # each unit in the lower layer receives a gradient from:
        # "num output feature maps * filter height * filter width" /
        #   pooling size
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                   numpy.prod(poolsize))
        # initialize weights with random weights
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(numpy.asarray(
            rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
            dtype=theano.config.floatX),
                               borrow=True)

        # the bias is a 1D tensor -- one bias per output feature map
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)          
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve input feature maps with filters
        conv_out = conv.conv2d(input=input, filters=self.W,                             # convolution with W; no bias added at this step
                filter_shape=filter_shape, image_shape=image_shape)

        # downsample each feature map individually, using maxpooling
        pooled_out = downsample.max_pool_2d(input=conv_out,                             # max pooling (not mean), non-overlapping windows
                                            ds=poolsize, ignore_border=True)

        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1,n_filters,1,1). Each bias will
        # thus be broadcasted across mini-batches and feature map
        # width & height
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))          # add the bias after pooling, squash with tanh; dimshuffle makes b broadcastable as (1, n_filters, 1, 1)

        # store parameters of this layer
        self.params = [self.W, self.b]                                                  # filters and bias bundled as the layer's parameters

# learning rate = 0.1, 200 epochs; nkerns=[20, 50]: 20 kernels in the first conv layer, 50 in the second; batch size = 500
def evaluate_lenet5(learning_rate=0.1, n_epochs=200,                                  
                    dataset='../data/mnist.pkl.gz',
                    nkerns=[20, 50], batch_size=500):
    """ Demonstrates lenet on MNIST datasets

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: path to the dataset used for training /testing (MNIST here)

    :type nkerns: list of ints
    :param nkerns: number of kernels on each layer
    """

    rng = numpy.random.RandomState(23455)                                               # seed the random number generator

    datasets = load_data(dataset)                                                       # load the dataset

    train_set_x, train_set_y = datasets[0]                                              # unpack the three splits
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute number of minibatches for training, validation and testing                # borrow=True reads the shared data without copying (faster); shape[0] is the sample count
    n_train_batches = train_set_x.get_value(borrow=True).shape[0]
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches = test_set_x.get_value(borrow=True).shape[0]
    n_train_batches /= batch_size                                                       # number of minibatches
    n_valid_batches /= batch_size
    n_test_batches /= batch_size

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch
    x = T.matrix('x')   # the data is presented as rasterized images (the current minibatch)
    y = T.ivector('y')  # the labels are presented as 1D vector of
                        # [int] labels

    ishape = (28, 28)  # this is the size of MNIST images

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print '... building the model'

    # Reshape matrix of rasterized images of shape (batch_size,28*28)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer
    layer0_input = x.reshape((batch_size, 1, 28, 28))                                   # the input is x reshaped to a 4D tensor

    # Construct the first convolutional pooling layer:
    # filtering reduces the image size to (28-5+1,28-5+1)=(24,24)
    # maxpooling reduces this further to (24/2,24/2) = (12,12)
    # 4D output tensor is thus of shape (batch_size,nkerns[0],12,12)
    # build the first conv-pool layer; input = layer0_input
    layer0 = LeNetConvPoolLayer(rng, input=layer0_input,
            image_shape=(batch_size, 1, 28, 28),
            filter_shape=(nkerns[0], 1, 5, 5), poolsize=(2, 2))

    # Construct the second convolutional pooling layer
    # filtering reduces the image size to (12-5+1,12-5+1)=(8,8)
    # maxpooling reduces this further to (8/2,8/2) = (4,4)
    # 4D output tensor is thus of shape (nkerns[0],nkerns[1],4,4)
    # build the second conv-pool layer; its input is layer0's output
    layer1 = LeNetConvPoolLayer(rng, input=layer0.output,
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5), poolsize=(2, 2))

    # the TanhLayer being fully-connected, it operates on 2D matrices of
    # shape (batch_size,num_pixels) (i.e matrix of rasterized images).
    # This will generate a matrix of shape (20,32*4*4) = (20,512)
    # layer2 is the first fully-connected layer; the flattened pooled output is its input
    layer2_input = layer1.output.flatten(2)

    # construct a fully-connected sigmoidal layer
    # built with the HiddenLayer class
    layer2 = HiddenLayer(rng, input=layer2_input, n_in=nkerns[1] * 4 * 4,
                         n_out=500, activation=T.tanh)

    # classify the values of the fully-connected sigmoidal layer
    # the output layer is a logistic regression
    layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

    # the cost we minimize during training is the NLL of the model
    # the cost is the negative log-likelihood (a method provided by LogisticRegression)
    cost = layer3.negative_log_likelihood(y)

    # create a function to compute the mistakes that are made by the model
    # define a function computing the test-set error; givens substitutes the chosen minibatch for x and y
    test_model = theano.function([index], layer3.errors(y),
             givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]})

    ## same as above: a function computing the validation-set error
    validate_model = theano.function([index], layer3.errors(y),
            givens={
                x: valid_set_x[index * batch_size: (index + 1) * batch_size],
                y: valid_set_y[index * batch_size: (index + 1) * batch_size]})

    # create a list of all model parameters to be fit by gradient descent
    # concatenate the parameter lists of all layers
    params = layer3.params + layer2.params + layer1.params + layer0.params

    # create a list of gradients for all model parameters
    # T.grad computes the partial derivative of the cost w.r.t. every parameter
    grads = T.grad(cost, params)

    # train_model is a function that updates the model parameters by
    # SGD Since this model has many parameters, it would be tedious to
    # manually create an update rule for each model parameter. We thus
    # create the updates list by automatically looping over all
    # (params[i],grads[i]) pairs.
    # updating every parameter by hand would be tedious, so build a list called updates in a loop (why a for loop, isn't that slow? ...right, this isn't MATLAB!)
    updates = []
    for param_i, grad_i in zip(params, grads):
        updates.append((param_i, param_i - learning_rate * grad_i))

    # define the training function: it returns the cost and applies the updates to the parameters
    train_model = theano.function([index], cost, updates=updates,
          givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]})

    ###############
    # TRAIN MODEL #
    ###############
    print '... training'
    # early-stopping parameters                                          
    patience = 10000  # look as this many examples regardless 
    patience_increase = 2  # wait this much longer when a new best is
                           # found (the iteration budget is doubled when validation improves enough)
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant (the loss must drop below 0.995 of the best so far)
    validation_frequency = min(n_train_batches, patience / 2)  # how often to check the validation set (not sure why exactly this value)
                                  # go through this manually
                                  # minibatche before checking the network
                                  # on the validation set; in this case we
                                  # check every epoch

    best_params = None
    best_validation_loss = numpy.inf
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):                        # outer loop: passes over the full training set
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):                     # inner loop: one minibatch at a time

            iter = (epoch - 1) * n_train_batches + minibatch_index          # total number of minibatches trained so far

            if iter % 100 == 0:                                             # print a progress message every 100 iterations
                print 'training @ iter = ', iter
            cost_ij = train_model(minibatch_index)                          # train on one minibatch

            if (iter + 1) % validation_frequency == 0:                      # time to evaluate on the validation set

                # compute zero-one loss on validation set                   # a list comprehension over validate_model(i) collects the loss of every validation batch
                validation_losses = [validate_model(i) for i
                                     in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)        # current loss = mean over validation batches
                print('epoch %i, minibatch %i/%i, validation error %f %%' % \
                      (epoch, minibatch_index + 1, n_train_batches, \
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:             # a new best score; if it also beats best * threshold below, patience is extended

                    #improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss *  \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [test_model(i) for i in xrange(n_test_batches)]  # evaluate the model on the test set
                    test_score = numpy.mean(test_losses)                           # tip: the parameters are trained on the train set and tuned on the validation set,
                    print(('     epoch %i, minibatch %i/%i, test error of best '   # so the fit is biased toward both; only a fresh set gives an unbiased measure
                           'model %f %%') %                                        # data is usually split train/validation/test about 6:2:2; here it is 5:1:1
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:                                               # patience exhausted (iteration budget used up): stop training
                done_looping = True
                break
    # timing and the final report below
    end_time = time.clock()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i,'\
          'with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print >> sys.stderr, ('The code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm' % ((end_time - start_time) / 60.))

if __name__ == '__main__':
    evaluate_lenet5()


def experiment(state, channel):
    evaluate_lenet5(state.learning_rate, dataset=state.dataset)