Deep learning: Part 49 (A simple look at the RNN-RBM)

Date: 2019-11-05

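Introduction:

This post looks at the last sample on the Deep Learning Tutorials page from Bengio's group: RNN-RBM in polyphonic music. It uses an RNN-RBM to model polyphonic music: the model is trained on MIDI files, and the trained model is then used to generate new polyphonic music. The difficulty in modeling music is that the frames within a piece are highly correlated in time (so the effective dimensionality of a sample is very high). An ordinary network cannot handle this, because ordinary network designs do not take the time dimension into account (among graphical models, the HMM does have this ability). An RNN can be used in this situation. Here RNN means recurrent neural network; there is another model with the same abbreviation, the recursive neural network. This post always means the recurrent one.

A brief introduction to RNNs:

First, how does an RNN differ from an ordinary feed-forward network? The RNN architecture is shown below:

From the diagram, the only difference from a feed-forward network is an extra loop on the hidden layer. The loop means that the hidden layer's output at the previous time step is fed back as an input to the hidden layer at the current step. That input is multiplied by a weight matrix, so the RNN has just one more weight matrix among its parameters. For a more illustrative picture of an RNN, see:

and the figure: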

According to the figure above, the forward pass of an RNN satisfies the following equations (from the reference Learning Recurrent Neural Networks with Hessian-Free Optimization):
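
As a sketch in the notation of the cited paper (the symbols are assumed here rather than taken from this post), with input x_i, hidden state h_i, prediction ŷ_i, and element-wise nonlinearities e and g, the forward pass can be written as:

```latex
\begin{aligned}
t_i &= W_{hx}\,x_i + W_{hh}\,h_{i-1} + b_h \\
h_i &= e(t_i) \\
s_i &= W_{yh}\,h_i + b_y \\
\hat{y}_i &= g(s_i)
\end{aligned}
```

The term W_{hh} h_{i-1} is the "loop" on the hidden layer: it is the only place where the previous time step enters.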

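The cost function can be the reconstruction error:

or the cross-entropy:

Readers familiar with ordinary deep networks should have no trouble following this.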
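
Written out for a single time step with target y and prediction ŷ (notation assumed, and taking the reconstruction error to be the usual squared error), the two costs have the following form; the sequence cost is their sum over time steps:

```latex
\begin{aligned}
L_{\text{sq}}(\hat{y}; y) &= \tfrac{1}{2}\sum_j \left(\hat{y}_j - y_j\right)^2 \\
L_{\text{ce}}(\hat{y}; y) &= -\sum_j \left[\, y_j \log \hat{y}_j + (1 - y_j)\log\left(1 - \hat{y}_j\right) \right]
\end{aligned}
```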
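
To make the recurrence concrete, here is a minimal NumPy sketch of an RNN forward pass with both cost functions. The layer sizes, the tanh hidden nonlinearity, the sigmoid output, the zero initial state, and all function names are assumptions chosen for illustration; none of them come from the tutorial code.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def rnn_forward(xs, W_hx, W_hh, W_yh, b_h, b_y):
    """Run a one-hidden-layer RNN over a sequence of input vectors."""
    h = np.zeros(W_hh.shape[0])  # assumed zero initial hidden state
    hs, ys = [], []
    for x in xs:
        # The "loop": the previous hidden state re-enters through W_hh.
        h = np.tanh(W_hx @ x + W_hh @ h + b_h)
        ys.append(sigmoid(W_yh @ h + b_y))
        hs.append(h)
    return np.array(hs), np.array(ys)


def squared_error(y_hat, y):
    return 0.5 * np.sum((y_hat - y) ** 2)


def cross_entropy(y_hat, y, eps=1e-12):
    y_hat = np.clip(y_hat, eps, 1.0 - eps)  # avoid log(0)
    return -np.sum(y * np.log(y_hat) + (1.0 - y) * np.log(1.0 - y_hat))


rng = np.random.default_rng(0)
n_in, n_hid, n_out, n_steps = 4, 5, 3, 6
W_hx = rng.normal(scale=0.5, size=(n_hid, n_in))
W_hh = rng.normal(scale=0.5, size=(n_hid, n_hid))
W_yh = rng.normal(scale=0.5, size=(n_out, n_hid))
b_h, b_y = np.zeros(n_hid), np.zeros(n_out)

xs = rng.normal(size=(n_steps, n_in))
targets = rng.integers(0, 2, size=(n_steps, n_out)).astype(float)
hs, ys = rnn_forward(xs, W_hx, W_hh, W_yh, b_h, b_y)
print(squared_error(ys, targets), cross_entropy(ys, targets))
```

Compared with a feed-forward layer, the only extra parameter is W_hh, which is the point made above about the RNN having one additional weight matrix.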

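A brief introduction to the RNN-RBM:

The RNN-RBM comes from the ICML 2012 paper Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription. It consists of a single-layer RBM and a single-layer RNN, with the output of the RNN serving as the final output of the network. The RBM part acts as the generative model (here, generating music), while the RNN part plays a discriminative role; for example, its output values can be treated as extracted features. The structure of the RNN-RBM is as follows: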

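The upper part of the model is the RBM and the lower part is the RNN; see the paper for the corresponding equations. The model has nine parameters in total: W, bv, bh, Wuv, Wuh, Wvu, Wuu, bu and the initial RNN state u0 (the code in this post fixes u0 to zeros and learns the other eight).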

The cost function of the whole model is the negative log-likelihood, -log P(v), where:
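
The quantities involved can be read off the `recurrence` and `build_rbm` functions in the code listing of this post. The following is a transcription of that code into equations (row-vector convention, as in the code; v* denotes the sample at the end of the Gibbs chain, treated as a constant), not a quotation from the paper:

```latex
\begin{aligned}
b_v^{(t)} &= b_v + u^{(t-1)} W_{uv} \\
b_h^{(t)} &= b_h + u^{(t-1)} W_{uh} \\
u^{(t)} &= \tanh\!\left(b_u + v^{(t)} W_{vu} + u^{(t-1)} W_{uu}\right) \\
F_t(v) &= -\,v \cdot b_v^{(t)} - \sum_j \log\!\left(1 + e^{\,(vW + b_h^{(t)})_j}\right) \\
\text{cost} &= \frac{1}{T}\sum_{t=1}^{T}\left[F_t\!\left(v^{(t)}\right) - F_t\!\left(v^{*(t)}\right)\right]
\end{aligned}
```

The gradient of this free-energy difference is the CD-k approximation to the gradient of the negative log-likelihood, which is why the code differentiates `cost` rather than -log P(v) itself.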

Differentiating this loss function and applying SGD yields all the parameters of the model. The RBM part additionally needs Gibbs sampling to carry out the CD-k algorithm.
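
The CD-k step can be sketched in plain NumPy for a stand-alone binary RBM. This is an illustration under assumptions (toy data, layer sizes, learning rate, and helper names are all made up here); it is not the tutorial's Theano implementation, although `free_energy` follows the same formula as the listing in this post.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sample_bernoulli(p, rng):
    return (rng.random(p.shape) < p).astype(float)


def gibbs_chain(v, W, bv, bh, k, rng):
    """Run k alternating h|v, v|h sampling steps starting from v."""
    for _ in range(k):
        h = sample_bernoulli(sigmoid(v @ W + bh), rng)
        v = sample_bernoulli(sigmoid(h @ W.T + bv), rng)
    return v


def free_energy(v, W, bv, bh):
    """Mean free energy of a batch of visible row vectors."""
    return np.mean(-(v @ bv) - np.log1p(np.exp(v @ W + bh)).sum(axis=1))


def cd_gradients(v0, vk, W, bv, bh):
    """Gradient of free_energy(v0) - free_energy(vk), with vk held constant."""
    h0, hk = sigmoid(v0 @ W + bh), sigmoid(vk @ W + bh)
    n = v0.shape[0]
    gW = (vk.T @ hk - v0.T @ h0) / n
    gbv = (vk - v0).mean(axis=0)
    gbh = (hk - h0).mean(axis=0)
    return gW, gbv, gbh


rng = np.random.default_rng(0)
data = np.array([[1, 1, 0, 0], [0, 0, 1, 1]] * 10, dtype=float)  # toy patterns
W = rng.normal(scale=0.01, size=(4, 3))
bv, bh = np.zeros(4), np.zeros(3)

for step in range(500):
    vk = gibbs_chain(data, W, bv, bh, k=1, rng=rng)  # negative particles
    gW, gbv, gbh = cd_gradients(data, vk, W, bv, bh)
    W, bv, bh = W - 0.1 * gW, bv - 0.1 * gbv, bh - 0.1 * gbh  # plain SGD step

recon = sigmoid(sigmoid(data @ W + bh) @ W.T + bv)
print("mean reconstruction error:", np.abs(recon - data).mean())
```

Holding the negative sample vk constant while differentiating plays the same role as `consider_constant=[v_sample]` in the Theano code.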

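Experimental results:

The experiment follows http://deeplearning.net/tutorial/rnnrbm.html; the data it needs and the material accompanying the paper are at http://www-etud.iro.umontreal.ca/~boulanni/icml2012. Since I do not know much music theory, I did not dig into many details of the experiment code and only followed the overall flow of the algorithm. Two piano-rolls generated by the RNN-RBM are shown below (the program ran for about 20 hours):

The cost printed for the last of the 200 epochs (the value is the monitor quantity returned by train_function, averaged over each epoch):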

......
Epoch 197/200 -4.7050858655
Epoch 198/200 -4.69198161366
Epoch 199/200 -4.66586797348
Epoch 200/200 -4.68651185036

The code is as follows:

# Author: Nicolas Boulanger-Lewandowski
# University of Montreal (2012)
# RNN-RBM deep learning tutorial
# More information at http://deeplearning.net/tutorial/rnnrbm.html

import glob
import os
import sys

import numpy
try:
    import pylab
except ImportError:
    print "pylab isn't available; if you use its functionality, it will crash"
    print "It can be installed with 'pip install -q matplotlib'"

from midi.utils import midiread, midiwrite
import theano
import theano.tensor as T
from theano.tensor.shared_randomstreams import RandomStreams

# Don't use a Python long, as this doesn't work on 32-bit computers.
numpy.random.seed(0xbeef)
rng = RandomStreams(seed=numpy.random.randint(1 << 30))
theano.config.warn.subtensor_merge_bug = False


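# Given the three RBM parameters W, bv, bh, the visible data v, and the Gibbs chain length k,
# return a tuple of: v_sample (the binarized visible sample after k Gibbs steps),
# cost (the CD-k surrogate for -log P(v) under the RBM), monitor (a quantity for monitoring),
# and updates (the updates that scan returns for shared variables, here the random-stream state).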
def build_rbm(v, W, bv, bh, k):
    '''Construct a k-step Gibbs chain starting at v for an RBM.

v : Theano vector or matrix
  If a matrix, multiple chains will be run in parallel (batch).
W : Theano matrix
  Weight matrix of the RBM.
bv : Theano vector
  Visible bias vector of the RBM.
bh : Theano vector
  Hidden bias vector of the RBM.
k : scalar or Theano scalar
  Length of the Gibbs chain.

Return a (v_sample, cost, monitor, updates) tuple:

v_sample : Theano vector or matrix with the same shape as `v`
  Corresponds to the generated sample(s).
cost : Theano scalar
  Expression whose gradient with respect to W, bv, bh is the CD-k approximation
  to the log-likelihood of `v` (training example) under the RBM.
  The cost is averaged in the batch case.
monitor: Theano scalar
  Pseudo log-likelihood (also averaged in the batch case).
updates: dictionary of Theano variable -> Theano variable
  The `updates` object returned by scan.'''

    def gibbs_step(v): # one Gibbs step; returns mean_v and the sampled v
        mean_h = T.nnet.sigmoid(T.dot(v, W) + bh)
        h = rng.binomial(size=mean_h.shape, n=1, p=mean_h, # Bernoulli sample: binarize the hidden units
                         dtype=theano.config.floatX)
        mean_v = T.nnet.sigmoid(T.dot(h, W.T) + bv)
        v = rng.binomial(size=mean_v.shape, n=1, p=mean_v, # project back down and binarize the visible units
                         dtype=theano.config.floatX)
        return mean_v, v # visible values before and after binarization
    # Starting from v, run k Gibbs steps; chain holds the sampled v after each step
    chain, updates = theano.scan(lambda v: gibbs_step(v)[1], outputs_info=[v],
                                 n_steps=k) # updates holds the random-stream state updates from scan
    v_sample = chain[-1] # binarized visible sample after k Gibbs steps

    mean_v = gibbs_step(v_sample)[0] # one more Gibbs step to get the non-binarized visible mean, used for monitoring
    monitor = T.xlogx.xlogy0(v, mean_v) + T.xlogx.xlogy0(1 - v, 1 - mean_v)
    monitor = monitor.sum() / v.shape[0]

    def free_energy(v): # Eq. 4: free energy of the RBM
        return -(v * bv).sum() - T.log(1 + T.exp(T.dot(v, W) + bh)).sum()
    cost = (free_energy(v) - free_energy(v_sample)) / v.shape[0] # cost function

    return v_sample, cost, monitor, updates


def shared_normal(num_rows, num_cols, scale=1):
    '''Initialize a matrix shared variable with normally distributed
elements.'''
    return theano.shared(numpy.random.normal(
        scale=scale, size=(num_rows, num_cols)).astype(theano.config.floatX))


def shared_zeros(*shape):
    '''Initialize a vector shared variable with zero elements.'''
    return theano.shared(numpy.zeros(shape, dtype=theano.config.floatX))


def build_rnnrbm(n_visible, n_hidden, n_hidden_recurrent):
    '''Construct a symbolic RNN-RBM and initialize parameters.

n_visible : integer
  Number of visible units.
n_hidden : integer
  Number of hidden units of the conditional RBMs.
n_hidden_recurrent : integer
  Number of hidden units of the RNN.

Return a (v, v_sample, cost, monitor, params, updates_train, v_t,
          updates_generate) tuple:

v : Theano matrix
  Symbolic variable holding an input sequence (used during training)
v_sample : Theano matrix
  Symbolic variable holding the negative particles for CD log-likelihood
  gradient estimation (used during training)
cost : Theano scalar
  Expression whose gradient (considering v_sample constant) corresponds to the
  LL gradient of the RNN-RBM (used during training)
monitor : Theano scalar
  Frame-level pseudo-likelihood (useful for monitoring during training)
params : tuple of Theano shared variables
  The parameters of the model to be optimized during training.
updates_train : dictionary of Theano variable -> Theano variable
  Update object that should be passed to theano.function when compiling the
  training function.
v_t : Theano matrix
  Symbolic variable holding a generated sequence (used during sampling)
updates_generate : dictionary of Theano variable -> Theano variable
  Update object that should be passed to theano.function when compiling the
  generation function.'''

    W = shared_normal(n_visible, n_hidden, 0.01)
    bv = shared_zeros(n_visible)
    bh = shared_zeros(n_hidden)
    Wuh = shared_normal(n_hidden_recurrent, n_hidden, 0.0001)
    Wuv = shared_normal(n_hidden_recurrent, n_visible, 0.0001)
    Wvu = shared_normal(n_visible, n_hidden_recurrent, 0.0001)
    Wuu = shared_normal(n_hidden_recurrent, n_hidden_recurrent, 0.0001)
    bu = shared_zeros(n_hidden_recurrent)

    params = W, bv, bh, Wuh, Wuv, Wvu, Wuu, bu  # learned parameters as shared
                                                # variables

    v = T.matrix()  # a training sequence
    u0 = T.zeros((n_hidden_recurrent,))  # initial value for the RNN hidden
                                         # units

    # If `v_t` is given, deterministic recurrence to compute the variable
    # biases bv_t, bh_t at each time step. If `v_t` is None, same recurrence
    # but with a separate Gibbs chain at each time step to sample (generate)
    # from the RNN-RBM. The resulting sample v_t is returned in order to be
    # passed down to the sequence history.
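    # Given v at time t and u at time t-1, return u, bv, bh at time t (no Gibbs
    # sampling happens in this branch).
    # Given only u at time t-1 (no v at time t), the RBM generates v with a 25-step
    # Gibbs chain, so v and u at time t are returned together with the scan updates.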
    def recurrence(v_t, u_tm1):
        bv_t = bv + T.dot(u_tm1, Wuv)
        bh_t = bh + T.dot(u_tm1, Wuh)
        generate = v_t is None
        if generate:
            v_t, _, _, updates = build_rbm(T.zeros((n_visible,)), W, bv_t, # first argument is v: the Gibbs chain starts from all zeros
                                           bh_t, k=25)
        u_t = T.tanh(bu + T.dot(v_t, Wvu) + T.dot(u_tm1, Wuu))
        return ([v_t, u_t], updates) if generate else [u_t, bv_t, bh_t]

    # For training, the deterministic recurrence is used to compute all the
    # {bv_t, bh_t, 1 <= t <= T} given v. Conditional RBMs can then be trained
    # in batches using those parameters.
    (u_t, bv_t, bh_t), updates_train = theano.scan( # symbolic training pass: deterministic recurrence giving bv_t, bh_t (the 15-step Gibbs chain is in build_rbm below)
        lambda v_t, u_tm1, *_: recurrence(v_t, u_tm1),
        sequences=v, outputs_info=[u0, None, None], non_sequences=params)
    v_sample, cost, monitor, updates_rbm = build_rbm(v, W, bv_t[:], bh_t[:],
                                                     k=15)
    updates_train.update(updates_rbm)

    # symbolic loop for sequence generation
    (v_t, u_t), updates_generate = theano.scan(
        lambda u_tm1, *_: recurrence(None, u_tm1), # symbolic generation pass, run for 200 time steps
        outputs_info=[None, u0], non_sequences=params, n_steps=200)

    return (v, v_sample, cost, monitor, params, updates_train, v_t, # cost comes from build_rbm()
            updates_generate)


class RnnRbm: # two roles: train the RNN-RBM and generate samples from the trained model
    '''Simple class to train an RNN-RBM from MIDI files and to generate sample
sequences.'''

    def __init__(self, n_hidden=150, n_hidden_recurrent=100, lr=0.001, 
                 r=(21, 109), dt=0.3):
        '''Constructs and compiles Theano functions for training and sequence
generation.

n_hidden : integer
  Number of hidden units of the conditional RBMs.
n_hidden_recurrent : integer
  Number of hidden units of the RNN.
lr : float
  Learning rate
r : (integer, integer) tuple
  Specifies the pitch range of the piano-roll in MIDI note numbers, including
  r[0] but not r[1], such that r[1]-r[0] is the number of visible units of the
  RBM at a given time step. The default (21, 109) corresponds to the full range
  of piano (88 notes).
dt : float
  Sampling period when converting the MIDI files into piano-rolls, or
  equivalently the time difference between consecutive time steps.'''

        self.r = r
        self.dt = dt
        (v, v_sample, cost, monitor, params, updates_train, v_t,
         updates_generate) = build_rnnrbm(r[1] - r[0], n_hidden, # Gibbs chain lengths and the number of generated steps are set inside this function
                                           n_hidden_recurrent)

        gradient = T.grad(cost, params, consider_constant=[v_sample])
        updates_train.update(((p, p - lr * g) for p, g in zip(params,
                                                                gradient))) # SGD: update the 8 parameters using the Eq. 4 cost
        self.train_function = theano.function([v], monitor,
                                               updates=updates_train)
        self.generate_function = theano.function([], v_t, # updates_generate is built in build_rnnrbm(), where the music generation mainly happens
                                                 updates=updates_generate)

    def train(self, files, batch_size=100, num_epochs=200):
        '''Train the RNN-RBM via stochastic gradient descent (SGD) using MIDI
files converted to piano-rolls.

files : list of strings
  List of MIDI files that will be loaded as piano-rolls for training.
batch_size : integer
  Training sequences will be split into subsequences of at most this size
  before applying the SGD updates.
num_epochs : integer
  Number of epochs (pass over the training set) performed. The user can
  safely interrupt training with Ctrl+C at any time.'''

        assert len(files) > 0, 'Training set is empty!' \
                               ' (did you download the data files?)'
        dataset = [midiread(f, self.r,
                            self.dt).piano_roll.astype(theano.config.floatX)
                   for f in files] # read the MIDI files

        try:
            for epoch in xrange(num_epochs): # train for num_epochs epochs (200 by default)
                numpy.random.shuffle(dataset) # shuffle the training sequences
                costs = []

                for s, sequence in enumerate(dataset): # s is the index, sequence the corresponding piano-roll
                    for i in xrange(0, len(sequence), batch_size):
                        cost = self.train_function(sequence[i:i + batch_size]) # train_function is compiled in __init__()
                        costs.append(cost)

                print 'Epoch %i/%i' % (epoch + 1, num_epochs),
                print numpy.mean(costs) 
                sys.stdout.flush()

        except KeyboardInterrupt:
            print 'Interrupted by user.'

    def generate(self, filename, show=True):
        '''Generate a sample sequence, plot the resulting piano-roll and save
it as a MIDI file.

filename : string
  A MIDI file will be created at this location.
show : boolean
  If True, a piano-roll of the generated sequence will be shown.'''

        piano_roll = self.generate_function() # generate the piano-roll directly
        midiwrite(filename, piano_roll, self.r, self.dt) # convert the piano-roll to MIDI and save it
        if show:
            extent = (0, self.dt * len(piano_roll)) + self.r
            pylab.figure()
            pylab.imshow(piano_roll.T, origin='lower', aspect='auto',
                         interpolation='nearest', cmap=pylab.cm.gray_r,
                         extent=extent)
            pylab.xlabel('time (s)')
            pylab.ylabel('MIDI note number')
            pylab.title('generated piano-roll')


def test_rnnrbm(batch_size=100, num_epochs=200):
    model = RnnRbm()
    # os.path.dirname(__file__) is the directory of this file; os.path.split(path) splits path at the last slash into head and tail
    re = os.path.join(os.path.split(os.path.dirname(__file__))[0], # build the pattern data/Nottingham/train/*.mid under the parent directory of this file
                      'data', 'Nottingham', 'train', '*.mid') # re is the glob pattern for those .mid files
    model.train(glob.glob(re), # glob.glob() expands the pattern into the list of matching file paths
                batch_size=batch_size, num_epochs=num_epochs)
    return model

if __name__ == '__main__':
    model = test_rnnrbm() # trains the RNN-RBM parameters
    model.generate('sample1.mid') # each Gibbs chain used for generation starts from an all-zero v
    model.generate('sample2.mid')
    pylab.show()
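
The generation branch of `recurrence` in the listing above can be mirrored in plain NumPy, which may make the data flow easier to follow than the symbolic scan. This is only a sketch: the parameters below are random and untrained (using the same shapes and initialization scales as `build_rnnrbm`), so the output is noise rather than music, and the helper names `init_params` and `generate` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(n_visible, n_hidden, n_recurrent, rng):
    """Random parameters with the same shapes and scales as build_rnnrbm."""
    def normal(rows, cols, scale):
        return rng.normal(scale=scale, size=(rows, cols))
    return dict(
        W=normal(n_visible, n_hidden, 0.01),
        bv=np.zeros(n_visible), bh=np.zeros(n_hidden),
        Wuh=normal(n_recurrent, n_hidden, 0.0001),
        Wuv=normal(n_recurrent, n_visible, 0.0001),
        Wvu=normal(n_visible, n_recurrent, 0.0001),
        Wuu=normal(n_recurrent, n_recurrent, 0.0001),
        bu=np.zeros(n_recurrent))


def generate(p, n_steps, k, rng):
    """Sample a piano-roll of shape (n_steps, n_visible)."""
    u = np.zeros_like(p["bu"])  # u0 = 0, as in the listing
    frames = []
    for _ in range(n_steps):
        # Time-dependent RBM biases, conditioned on the previous RNN state.
        bv_t = p["bv"] + u @ p["Wuv"]
        bh_t = p["bh"] + u @ p["Wuh"]
        v = np.zeros_like(p["bv"])  # every Gibbs chain starts from zeros
        for _ in range(k):
            h = (rng.random(bh_t.shape) < sigmoid(v @ p["W"] + bh_t)).astype(float)
            v = (rng.random(bv_t.shape) < sigmoid(h @ p["W"].T + bv_t)).astype(float)
        # The sampled frame feeds the RNN state used at the next time step.
        u = np.tanh(p["bu"] + v @ p["Wvu"] + u @ p["Wuu"])
        frames.append(v)
    return np.array(frames)


rng = np.random.default_rng(0)
params = init_params(n_visible=88, n_hidden=150, n_recurrent=100, rng=rng)
roll = generate(params, n_steps=20, k=25, rng=rng)
print(roll.shape)
```

Each row of `roll` is one time step of dt seconds and each column one of the 88 piano pitches, which is the layout that `midiwrite` receives in the listing.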

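Experiment summary:

On the BP algorithm: differentiating the loss in the RNN-RBM uses BPTT (back-propagation through time), which is BP with a time dimension added. To deepen my understanding of BP, I went through the derivation again. The derivation follows mainly from the chain rule. For the details, see Chapter 11 of Neural Network Design by Martin T. Hagan (a well-written book, and the translation is decent too). The idea is roughly as follows. The derivative of the loss F with respect to w_ij in layer m (the weight connecting node i of layer m and node j of layer m-1) equals the derivative of F with respect to the input of node i in layer m, multiplied by the derivative of that input with respect to w_ij (which is simply the output of node j in layer m-1). The derivative of F with respect to the input of node i in layer m in turn equals the derivative of F with respect to the inputs of layer m+1 (all nodes of layer m+1 must be taken into account here) multiplied by the derivative of the layer m+1 inputs with respect to the input of node i in layer m (easily obtained from the derivative of the activation function). What we usually call BP is error back-propagation, and the "error" of layer m here is exactly the derivative of F with respect to the inputs of layer m. The derivatives can therefore be computed layer by layer, starting from the last layer and moving backwards. That is the idea of BP; in essence it is the chain rule from calculus.

In addition, I did not look closely at the music-theory details in the experiment.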
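
The layer-by-layer rule described above can be checked numerically on a tiny network. The sketch below assumes a sigmoid hidden layer, a linear output layer, a squared-error loss, and no biases; these choices and all names are made up for illustration. The analytic gradients are compared with central finite differences.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(x, W1, W2):
    n1 = W1 @ x          # inputs to the hidden layer
    a1 = sigmoid(n1)     # outputs of the hidden layer
    n2 = W2 @ a1         # inputs to the (linear) output layer
    return n1, a1, n2


def loss(x, t, W1, W2):
    return 0.5 * np.sum((forward(x, W1, W2)[2] - t) ** 2)


def backprop(x, t, W1, W2):
    n1, a1, n2 = forward(x, W1, W2)
    delta2 = n2 - t                              # dF/dn2 at the last layer
    delta1 = (W2.T @ delta2) * a1 * (1.0 - a1)   # dF/dn1 via the chain rule
    # dF/dw_ij = (dF/dn_i) * (output of node j in the previous layer)
    return np.outer(delta1, x), np.outer(delta2, a1)


def numerical_grad(f, W, eps=1e-6):
    """Central finite differences of f() with respect to each entry of W."""
    g = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + eps
        f_plus = f()
        W[idx] = old - eps
        f_minus = f()
        W[idx] = old  # restore the original weight
        g[idx] = (f_plus - f_minus) / (2 * eps)
    return g


rng = np.random.default_rng(0)
x, t = rng.normal(size=3), rng.normal(size=2)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(2, 4))
gW1, gW2 = backprop(x, t, W1, W2)
num_gW1 = numerical_grad(lambda: loss(x, t, W1, W2), W1)
num_gW2 = numerical_grad(lambda: loss(x, t, W1, W2), W2)
print(np.max(np.abs(gW1 - num_gW1)), np.max(np.abs(gW2 - num_gW2)))
```

Here `delta1` and `delta2` are the per-layer "errors" from the paragraph above. BPTT applies the same recursion to the RNN unrolled over time steps.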

References:

http://deeplearning.net/tutorial/rnnrbm.html (tutorial page)

Neural Network Design, Martin T. Hagan.

http://www.cse.unsw.edu.au/~waleed/phd/html/node37.html (source of RNN figure 1)

Recurrent Neural Networks in Ruby. (source of RNN figure 2)

Learning Recurrent Neural Networks with Hessian-Free Optimization, James Martens, Ilya Sutskever.

Modeling Temporal Dependencies in High-Dimensional Sequences: Application to Polyphonic Music Generation and Transcription, Nicolas Boulanger-Lewandowski, Yoshua Bengio, Pascal Vincent.

http://www-etud.iro.umontreal.ca/~boulanni/icml2012 (RNN-RBM project page)
