Learning Deep Learning with Theano (Part 3): Convolutional Neural Networks

Some rambling before the main content:

My TOEFL score is out: I nervously pulled off a 97! Way better than I expected! Great news all around! Confetti!

Silly me... I still have to grind for a month over winter break to push for a 100, so what exactly am I celebrating...

Main topic:

The topic is the CNN, but I'll explain the actual theory behind CNNs in detail in a later post under the deeplearning category.

Briefly, what makes a CNN different from an ordinary NN is that it replaces full connectivity with local (partial) connections, and uses pooling to reduce the dimensionality of the data. This has several benefits:

    1. For large images, the number of parameters to train drops dramatically
    2. The network captures local features of the image rather than only global ones
    3. Pooling gives the network's output a degree of invariance to translation and occlusion (see the numpy sketch after this list)
    4. For a demo see: (the results are quite good; Wall Street banks used it back in the day to read checks)
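To see the invariance point concretely, here is a tiny numpy sketch (the 2x2 max-pool helper and the toy images are my own illustration, not part of the tutorial code): shifting a feature by one pixel keeps it inside the same pooling window, so the pooled output does not change.

import numpy

def max_pool_2x2(img):
    # split the image into 2x2 blocks and take the max of each block
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

a = numpy.zeros((4, 4)); a[0, 0] = 1.0   # a feature at (0, 0)
b = numpy.zeros((4, 4)); b[1, 1] = 1.0   # the same feature shifted to (1, 1)
print(max_pool_2x2(a))                    # both print [[1. 0.] [0. 0.]]
print(max_pool_2x2(b))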

Here I'll mainly talk about the code.

1. Class: LeNetConvPoolLayer

    • One convolution plus one pooling operation, two layers in total.
    • Initialization arguments: the input data, the input image size, the filter size, and the pooling size.
    • Pooling outputs the maximum of each window, not the average.
    • The internal parameters are the filter bank W and the bias b; the overall output = tanh(pooled output + bias).
    • W and b are merged into a single list, params.
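For orientation, a minimal usage sketch; it assumes the LeNetConvPoolLayer class from the listing further down, with the same MNIST shapes used there:

import numpy
import theano.tensor as T

rng = numpy.random.RandomState(1234)
x = T.tensor4('x')                         # (batch, channels, height, width)
layer = LeNetConvPoolLayer(rng, input=x,
                           image_shape=(500, 1, 28, 28),
                           filter_shape=(20, 1, 5, 5),
                           poolsize=(2, 2))
# conv shrinks 28 to 28-5+1 = 24; 2x2 max pooling halves it to 12,
# so layer.output is a (500, 20, 12, 12) tensor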

2. Function: evaluate_lenet5

    • Contains two LeNetConvPoolLayers (layer0, layer1) plus two fully-connected layers (layer2, layer3).
    • The first fully-connected layer uses the HiddenLayer class; the second uses the LogisticRegression class (both from the MLP tutorial code, to be covered later).
    • test_model and validate_model: given a minibatch index, they return the error rate against the labels.
    • The parameters of all four layers are merged into one list: params = layer3.params + layer2.params + layer1.params + layer0.params (you can do that? I'd never seen it; see the sketch after this list), and grads = T.grad(cost, params) then computes all the partial derivatives at once. Very convenient.
    • train_model updates the parameters through the updates mechanism (faster; the updates list is built with a for loop).
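About that params trick: each layer's params attribute is just a Python list of shared variables, so + genuinely concatenates lists. A minimal sketch of the whole pattern, with a made-up toy cost standing in for the real network:

import numpy
import theano
import theano.tensor as T

# two toy "layers", each exposing a params list of shared variables
W0 = theano.shared(numpy.zeros((3, 3)), name='W0')
b0 = theano.shared(numpy.zeros(3), name='b0')
W1 = theano.shared(numpy.ones(3), name='W1')
layer0_params = [W0, b0]
layer1_params = [W1]

# '+' on Python lists concatenates, so this is one flat list of parameters
params = layer1_params + layer0_params

x = T.vector('x')
cost = T.sum(T.dot(T.dot(x, W0) + b0, W1) ** 2)   # toy cost, not LeNet's

# T.grad over a list returns one gradient per parameter
grads = T.grad(cost, params)

# build the SGD update list: (shared_variable, new_value) pairs
learning_rate = 0.1
updates = [(p, p - learning_rate * g) for p, g in zip(params, grads)]
train = theano.function([x], cost, updates=updates)

Calling train(numpy.ones(3)) would then take one SGD step on all three shared variables at once.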

That's roughly what these two building blocks look like.

Key points in the training process:

  • Two nested loops: the inner one runs over minibatches, indexed by minibatch_index; the outer one runs over passes through the full training set, indexed by epoch; iter counts the total number of training steps so far
  • patience caps the number of iterations; it starts at 10000 and is extended to twice the current iteration count whenever validation shows a big enough improvement
  • Every validation_frequency iterations the model is evaluated; if the current loss is below 0.995 × the best loss so far, patience is extended (see the sketch after this list)
  • Training stops once iter >= patience or epoch >= n_epochs
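To make the patience mechanics concrete, here is a stripped-down replica of the control flow; the fake validation loss is invented so the snippet runs on its own, and the names mirror the real code further down:

patience = 10000
patience_increase = 2
improvement_threshold = 0.995
n_epochs, n_train_batches = 200, 100
best_validation_loss = float('inf')

epoch, done_looping = 0, False
while epoch < n_epochs and not done_looping:
    epoch += 1
    for minibatch_index in range(n_train_batches):
        iter = (epoch - 1) * n_train_batches + minibatch_index
        this_loss = 1.0 / (iter + 1)      # stand-in for validate_model's output
        if this_loss < best_validation_loss * improvement_threshold:
            # significant improvement: extend the iteration budget
            patience = max(patience, iter * patience_increase)
        best_validation_loss = min(best_validation_loss, this_loss)
        if patience <= iter:              # budget used up: stop
            done_looping = True
            break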

That's roughly the training process.

A few thoughts:

  • Working through this code gave me a deeper understanding of Python classes.
  • As I understand it now, instantiating a class runs its initialization and sets up the internal parameters;
  • after that, calling the class's methods operates on that stored state, updating the parameters and handing them to the function
  • So a class is better than a function in C (C can only compute, while in Python the class builds the structure and keeps the parameters around)
  • I still don't fully understand shape and dimshuffle in theano.tensor (see the sketch after this list)
  • Also, this code uses for loops all over the place, while in MATLAB for loops are taboo. Why are they so common here, with matrix operations comparatively rare?
  • validation_losses = [validate_model(i) for i in xrange(n_valid_batches)] is a slick bit of usage (a list comprehension)

  • params = layer3.params + layer2.params + layer1.params + layer0.params: so this means concatenating the lists? (It does; + concatenates Python lists.)
  • Updating the parameters through updates: fast, precise, ruthless!
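On the dimshuffle question: dimshuffle('x', 0, 'x', 'x') pads a length-n vector out to shape (1, n, 1, 1) with broadcastable singleton axes, which is exactly what the bias addition in LeNetConvPoolLayer needs. A quick check (the all-zeros pooled output is just a stand-in):

import numpy
import theano
import theano.tensor as T

b = theano.shared(numpy.arange(3.0))   # a bias vector, shape (3,)
b4 = b.dimshuffle('x', 0, 'x', 'x')    # now shape (1, 3, 1, 1)
pooled = T.zeros((2, 3, 4, 4))         # stand-in for a pooled feature map
out = pooled + b4                      # bias broadcasts over batch, height, width
print(b4.eval().shape)                 # (1, 3, 1, 1)
print(out.eval().shape)                # (2, 3, 4, 4)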

 


 

Below is the code, which I read line by line and annotated as I went. (cnblog's column is too narrow; copy it into a text editor to read it, Sublime recommended.)

This implementation simplifies the model in the following ways:

 - LeNetConvPool doesn't implement location-specific gain and bias parameters
 - LeNetConvPool doesn't implement pooling by average, it implements pooling
   by max.
 - Digit classification is implemented with a logistic regression rather than
   an RBF network
 - LeNet5 was not fully-connected convolutions at second layer

"""
import cPickle
import gzip
import os
import sys
import time

import numpy

import theano
import theano.tensor as T
from theano.tensor.signal import downsample
from theano.tensor.nnet import conv

from logistic_sgd import LogisticRegression, load_data
from mlp import HiddenLayer


class LeNetConvPoolLayer(object):
    """Pool Layer of a convolutional network """

    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        Allocate a LeNetConvPoolLayer with shared variable internal parameters.

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dtensor4
        :param input: symbolic image tensor, of shape image_shape

        :type filter_shape: tuple or list of length 4
        :param filter_shape: (number of filters, num input feature maps,
                              filter height,filter width)

        :type image_shape: tuple or list of length 4
        :param image_shape: (batch size, num input feature maps,
                             image height, image width)

        :type poolsize: tuple or list of length 2
        :param poolsize: the downsampling (pooling) factor (#rows,#cols)
        """

        assert image_shape[1] == filter_shape[1]
        self.input = input

        # there are "num input feature maps * filter height * filter width"
        # inputs to each hidden unit
        fan_in = numpy.prod(filter_shape[1:])
        # each unit in the lower layer receives a gradient from:
        # "num output feature maps * filter height * filter width" /
        #   pooling size
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                   numpy.prod(poolsize))
        # initialize weights with random weights
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(numpy.asarray(
            rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
            dtype=theano.config.floatX),
                               borrow=True)

        # the bias is a 1D tensor -- one bias per output feature map
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)          
        self.b = theano.shared(value=b_values, borrow=True)

        # convolve input feature maps with filters
        conv_out = conv.conv2d(input=input, filters=self.W,                             # convolution with W; no bias added at this step
                filter_shape=filter_shape, image_shape=image_shape)

        # downsample each feature map individually, using maxpooling
        pooled_out = downsample.max_pool_2d(input=conv_out,                             # max pooling (not mean), non-overlapping windows
                                            ds=poolsize, ignore_border=True)

        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1,n_filters,1,1). Each bias will
        # thus be broadcasted across mini-batches and feature map
        # width & height
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))          # add the bias after pooling, squash with tanh; dimshuffle makes b broadcastable as (1, n_filters, 1, 1)

        # store parameters of this layer
        self.params = [self.W, self.b]                                                  # filters and bias bundled as the layer's parameters

# learning rate = 0.1, 200 epochs; nkerns=[20, 50]: 20 kernels in the first conv layer, 50 in the second; batch size = 500
def evaluate_lenet5(learning_rate=0.1, n_epochs=200,                                  
                    dataset='../data/mnist.pkl.gz',
                    nkerns=[20, 50], batch_size=500):
    """ Demonstrates lenet on MNIST datasets

    :type learning_rate: float
    :param learning_rate: learning rate used (factor for the stochastic
                          gradient)

    :type n_epochs: int
    :param n_epochs: maximal number of epochs to run the optimizer

    :type dataset: string
    :param dataset: path to the dataset used for training /testing (MNIST here)

    :type nkerns: list of ints
    :param nkerns: number of kernels on each layer
    """

    rng = numpy.random.RandomState(23455)                                               # seed the random number generator

    datasets = load_data(dataset)                                                       # load the dataset

    train_set_x, train_set_y = datasets[0]                                              # unpack the three splits
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # compute number of minibatches for training, validation and testing                # borrow=True reads the shared data without copying (faster); shape[0] is the sample count
    n_train_batches = train_set_x.get_value(borrow=True).shape[0]
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches = test_set_x.get_value(borrow=True).shape[0]
    n_train_batches /= batch_size                                                       # number of minibatches
    n_valid_batches /= batch_size
    n_test_batches /= batch_size

    # allocate symbolic variables for the data
    index = T.lscalar()  # index to a [mini]batch
    x = T.matrix('x')   # the data is presented as rasterized images (the current minibatch)
    y = T.ivector('y')  # the labels are presented as 1D vector of
                        # [int] labels

    ishape = (28, 28)  # this is the size of MNIST images

    ######################
    # BUILD ACTUAL MODEL #
    ######################
    print '... building the model'

    # Reshape matrix of rasterized images of shape (batch_size,28*28)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer
    layer0_input = x.reshape((batch_size, 1, 28, 28))                                   # the input is x reshaped to a 4D tensor

    # Construct the first convolutional pooling layer:
    # filtering reduces the image size to (28-5+1,28-5+1)=(24,24)
    # maxpooling reduces this further to (24/2,24/2) = (12,12)
    # 4D output tensor is thus of shape (batch_size,nkerns[0],12,12)
    # build the first conv-pool layer; input = layer0_input
    layer0 = LeNetConvPoolLayer(rng, input=layer0_input,
            image_shape=(batch_size, 1, 28, 28),
            filter_shape=(nkerns[0], 1, 5, 5), poolsize=(2, 2))

    # Construct the second convolutional pooling layer
    # filtering reduces the image size to (12-5+1,12-5+1)=(8,8)
    # maxpooling reduces this further to (8/2,8/2) = (4,4)
    # 4D output tensor is thus of shape (nkerns[0],nkerns[1],4,4)
    # build the second conv-pool layer; its input is layer0's output
    layer1 = LeNetConvPoolLayer(rng, input=layer0.output,
            image_shape=(batch_size, nkerns[0], 12, 12),
            filter_shape=(nkerns[1], nkerns[0], 5, 5), poolsize=(2, 2))

    # the TanhLayer being fully-connected, it operates on 2D matrices of
    # shape (batch_size,num_pixels) (i.e matrix of rasterized images).
    # This will generate a matrix of shape (20,32*4*4) = (20,512)
    # layer2 is the first fully-connected layer; the flattened pooled output is its input
    layer2_input = layer1.output.flatten(2)

    # construct a fully-connected sigmoidal layer
    # built with the HiddenLayer class
    layer2 = HiddenLayer(rng, input=layer2_input, n_in=nkerns[1] * 4 * 4,
                         n_out=500, activation=T.tanh)

    # classify the values of the fully-connected sigmoidal layer
    # the output layer is a logistic regression
    layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

    # the cost we minimize during training is the NLL of the model
    # the cost is the negative log-likelihood (a method provided by LogisticRegression)
    cost = layer3.negative_log_likelihood(y)

    # create a function to compute the mistakes that are made by the model
    # define a function computing the test-set error; givens substitutes the chosen minibatch for x and y
    test_model = theano.function([index], layer3.errors(y),
             givens={
                x: test_set_x[index * batch_size: (index + 1) * batch_size],
                y: test_set_y[index * batch_size: (index + 1) * batch_size]})

    ## same as above: a function computing the validation-set error
    validate_model = theano.function([index], layer3.errors(y),
            givens={
                x: valid_set_x[index * batch_size: (index + 1) * batch_size],
                y: valid_set_y[index * batch_size: (index + 1) * batch_size]})

    # create a list of all model parameters to be fit by gradient descent
    # concatenate the parameter lists of all layers
    params = layer3.params + layer2.params + layer1.params + layer0.params

    # create a list of gradients for all model parameters
    # T.grad computes the partial derivative of the cost w.r.t. every parameter
    grads = T.grad(cost, params)

    # train_model is a function that updates the model parameters by
    # SGD Since this model has many parameters, it would be tedious to
    # manually create an update rule for each model parameter. We thus
    # create the updates list by automatically looping over all
    # (params[i],grads[i]) pairs.
    # updating every parameter by hand would be tedious, so build a list called updates in a loop (why a for loop, isn't that slow? ...right, this isn't MATLAB!)
    updates = []
    for param_i, grad_i in zip(params, grads):
        updates.append((param_i, param_i - learning_rate * grad_i))

    # define the training function: it returns the cost and applies the updates to the parameters
    train_model = theano.function([index], cost, updates=updates,
          givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]})

    ###############
    # TRAIN MODEL #
    ###############
    print '... training'
    # early-stopping parameters                                          
    patience = 10000  # look as this many examples regardless 
    patience_increase = 2  # wait this much longer when a new best is
                           # found (the iteration budget is doubled when validation improves enough)
    improvement_threshold = 0.995  # a relative improvement of this much is
                                   # considered significant (the loss must drop below 0.995 of the best so far)
    validation_frequency = min(n_train_batches, patience / 2)  # how often to check the validation set (not sure why exactly this value)
                                  # go through this manually
                                  # minibatche before checking the network
                                  # on the validation set; in this case we
                                  # check every epoch

    best_params = None
    best_validation_loss = numpy.inf
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False

    while (epoch < n_epochs) and (not done_looping):                        # outer loop: passes over the full training set
        epoch = epoch + 1
        for minibatch_index in xrange(n_train_batches):                     # inner loop: one minibatch at a time

            iter = (epoch - 1) * n_train_batches + minibatch_index          # total number of minibatches trained so far

            if iter % 100 == 0:                                             # print a progress message every 100 iterations
                print 'training @ iter = ', iter
            cost_ij = train_model(minibatch_index)                          # train on one minibatch

            if (iter + 1) % validation_frequency == 0:                      # time to evaluate on the validation set

                # compute zero-one loss on validation set                   # a list comprehension over validate_model(i) collects the loss of every validation batch
                validation_losses = [validate_model(i) for i
                                     in xrange(n_valid_batches)]
                this_validation_loss = numpy.mean(validation_losses)        # current loss = mean over validation batches
                print('epoch %i, minibatch %i/%i, validation error %f %%' % \
                      (epoch, minibatch_index + 1, n_train_batches, \
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:             # a new best score; if it also beats best * threshold below, patience is extended

                    #improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss *  \
                       improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [test_model(i) for i in xrange(n_test_batches)]  # evaluate the model on the test set
                    test_score = numpy.mean(test_losses)                           # tip: the parameters are trained on the train set and tuned on the validation set,
                    print(('     epoch %i, minibatch %i/%i, test error of best '   # so the fit is biased toward both; only a fresh set gives an unbiased measure
                           'model %f %%') %                                        # data is usually split train/validation/test about 6:2:2; here it is 5:1:1
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:                                               # patience exhausted (iteration budget used up): stop training
                done_looping = True
                break
    # timing and the final report below
    end_time = time.clock()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i,'\
          'with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print >> sys.stderr, ('The code for file ' +
                          os.path.split(__file__)[1] +
                          ' ran for %.2fm' % ((end_time - start_time) / 60.))

if __name__ == '__main__':
    evaluate_lenet5()


def experiment(state, channel):
    evaluate_lenet5(state.learning_rate, dataset=state.dataset)