In a CNN, each convolution is followed by a pooling step. The most common pooling operations are mean pooling and max pooling; at ICLR 2013, Zeiler proposed an alternative called stochastic pooling, and the same paper also describes a probability weighted pooling method that performs slightly worse.
Stochastic pooling is very simple: an element of the pooling region is chosen at random with probability proportional to its value, so larger elements are more likely to be selected. Unlike max pooling, it does not always return the single largest element.
Suppose the elements of a 3×3 pooling region of the feature map are:

0    1.1  2.5
0.9  2.0  1.0
0    1.5  1.0

Their sum is sum = 0 + 1.1 + 2.5 + 0.9 + 2.0 + 1.0 + 0 + 1.5 + 1.0 = 10. Dividing every cell by sum gives the matrix:

0     0.11  0.25
0.09  0.2   0.1
0     0.15  0.1
Each element of this matrix is the probability of the corresponding location, and we now select one location according to these probabilities. One way is to treat them as a multinomial distribution over the 9 cells and draw a single sample from it; theano's multinomial() function does exactly this. Alternatively, you can sample it yourself from a uniform [0, 1] draw: partition the unit interval into 9 sub-intervals according to the 9 probabilities (a larger probability gets a longer sub-interval, and each sub-interval maps to one location), generate a random number, and see which sub-interval it lands in.
For example, if the sampled (one-hot) matrix is

0  0  0
0  0  0
0  1  0

then the pooling output is 1.5.
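The whole training-time procedure for one region fits in a few lines of numpy. A minimal sketch of the idea (using numpy's sampling instead of theano's multinomial(); this is illustrative, not the pylearn2 code):

import numpy as np

rng = np.random.default_rng(0)

# the 3x3 pooling region from the example above (values must be non-negative)
region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])

# divide by the region sum to get the per-location probabilities
probs = region / region.sum()

# draw one flat index from the 9-way multinomial and pool that element
idx = rng.choice(region.size, p=probs.ravel())
print(region.ravel()[idx])  # a random element; larger values are drawn more often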
At test time, inference with stochastic pooling is also simple: take the probability-weighted average over the region. For the example above:

0*0 + 1.1*0.11 + 2.5*0.25 + 0.9*0.09 + 2.0*0.2 + 1.0*0.1 + 0*0 + 1.5*0.15 + 1.0*0.1 = 1.652

so the test-time pooling result for this region is 1.652.
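The same sketch extends to the test-time rule; again illustrative numpy with the region hard-coded from the example:

import numpy as np

region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])
probs = region / region.sum()

# test time: pool to the probability-weighted average, i.e. the expectation
# of the training-time random sampling over this region
print((region * probs).sum())  # 1.652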
For backpropagation, simply keep the location of the element selected during the forward pass: the gradient flows through that position, and every other position gets gradient 0. This is very similar to backpropagation through max pooling.
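A sketch of that gradient routing on the same region, assuming the forward pass recorded the sampled location (hard-coded here to the cell holding 1.5):

import numpy as np

region = np.array([[0.0, 1.1, 2.5],
                   [0.9, 2.0, 1.0],
                   [0.0, 1.5, 1.0]])

selected = (2, 1)                    # location sampled in the forward pass (value 1.5)
grad_out = 1.0                       # upstream gradient arriving at the pooled value
grad_region = np.zeros_like(region)  # gradient w.r.t. the 3x3 region
grad_region[selected] = grad_out     # only the sampled cell receives gradient
print(grad_region)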
Advantages of stochastic pooling:

The method is simple;

It generalizes better;

It can be applied to convolutional layers (the paper compares it against Dropout and DropConnect, claiming those two are less suitable for convolutional layers; personally I don't find the comparison entirely fair, since they act on different parts of the network);
As for why stochastic pooling works well, the author argues that it too is a form of model averaging: each distinct set of pooling selections defines a different model, training samples one such model per example, and the test-time weighted average approximates averaging over all of them. I did not fully follow this part.
Code for the forward pass and the inference pass of stochastic pooling is given below (the backprop pass is not included, so the selected pooling locations are not saved).

Source: pylearn2/stochastic_pool.py
""" An implementation of stochastic max-pooling, based on Stochastic Pooling for Regularization of Deep Convolutional Neural Networks Matthew D. Zeiler, Rob Fergus, ICLR 2013 """ __authors__ = "Mehdi Mirza" __copyright__ = "Copyright 2010-2012, Universite de Montreal" __credits__ = ["Mehdi Mirza", "Ian Goodfellow"] __license__ = "3-clause BSD" __maintainer__ = "Mehdi Mirza" __email__ = "mirzamom@iro" import numpy import theano from theano import tensor from theano.sandbox.rng_mrg import MRG_RandomStreams as RandomStreams from theano.gof.op import get_debug_values def stochastic_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None): """ Stochastic max pooling for training as defined in: Stochastic Pooling for Regularization of Deep Convolutional Neural Networks Matthew D. Zeiler, Rob Fergus bc01: minibatch in format (batch size, channels, rows, cols), IMPORTANT: All values should be poitivie pool_shape: shape of the pool region (rows, cols) pool_stride: strides between pooling regions (row stride, col stride) image_shape: avoid doing some of the arithmetic in theano rng: theano random stream """ r, c = image_shape pr, pc = pool_shape rs, cs = pool_stride batch = bc01.shape[0] #總共batch的個數 channel = bc01.shape[1] #通道個數 if rng is None: rng = RandomStreams(2022) # Compute index in pooled space of last needed pool # (needed = each input pixel must appear in at least one pool) def last_pool(im_shp, p_shp, p_strd): rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd)) assert p_strd * rval + p_shp >= im_shp assert p_strd * (rval - 1) + p_shp < im_shp return rval #表示pool過程當中須要移動的次數 return T.dot(x, self._W) # Compute starting row of the last pool last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0] #最後一個pool的起始位置 # Compute number of rows needed in image for all indexes to work out required_r = last_pool_r + pr #知足上面pool條件時所須要image的高度 last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1] required_c = last_pool_c + pc # final result shape res_r = int(numpy.floor(last_pool_r/rs)) + 1 #最後pool完成時圖片的shape res_c = int(numpy.floor(last_pool_c/cs)) + 1 for bc01v in get_debug_values(bc01): assert not numpy.any(numpy.isinf(bc01v)) assert bc01v.shape[2] == image_shape[0] assert bc01v.shape[3] == image_shape[1] # padding,若是不能整除移動,須要對原始圖片進行擴充 padded = tensor.alloc(0.0, batch, channel, required_r, required_c) name = bc01.name if name is None: name = 'anon_bc01' bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01) bc01.name = 'zero_padded_' + name # unraveling window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc) window.name = 'unravlled_winodows_' + name for row_within_pool in xrange(pool_shape[0]): row_stop = last_pool_r + row_within_pool + 1 for col_within_pool in xrange(pool_shape[1]): col_stop = last_pool_c + col_within_pool + 1 win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs] window = tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell) #windows中裝的是全部的pooling數據塊 # find the norm norm = window.sum(axis = [4, 5]) #求和當分母用 norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm) #若是norm爲0,則將norm賦值爲1 norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x') #除以norm獲得每一個位置的機率 # get prob prob = rng.multinomial(pvals = norm.reshape((batch * channel * res_r * res_c, pr * pc)), dtype='float32') #multinomial()函數可以按照pvals產生多個多項式分佈,元素值爲0或1 # select res = (window * prob.reshape((batch, channel, res_r, res_c, pr, pc))).max(axis=5).max(axis=4) 
#window和後面的矩陣相乘是點乘,即對應元素相乘,numpy矩陣符號 res.name = 'pooled_' + name return tensor.cast(res, theano.config.floatX) def weighted_max_pool_bc01(bc01, pool_shape, pool_stride, image_shape, rng = None): """ This implements test time probability weighted pooling defined in: Stochastic Pooling for Regularization of Deep Convolutional Neural Networks Matthew D. Zeiler, Rob Fergus bc01: minibatch in format (batch size, channels, rows, cols), IMPORTANT: All values should be poitivie pool_shape: shape of the pool region (rows, cols) pool_stride: strides between pooling regions (row stride, col stride) image_shape: avoid doing some of the arithmetic in theano """ r, c = image_shape pr, pc = pool_shape rs, cs = pool_stride batch = bc01.shape[0] channel = bc01.shape[1] if rng is None: rng = RandomStreams(2022) # Compute index in pooled space of last needed pool # (needed = each input pixel must appear in at least one pool) def last_pool(im_shp, p_shp, p_strd): rval = int(numpy.ceil(float(im_shp - p_shp) / p_strd)) assert p_strd * rval + p_shp >= im_shp assert p_strd * (rval - 1) + p_shp < im_shp return rval # Compute starting row of the last pool last_pool_r = last_pool(image_shape[0] ,pool_shape[0], pool_stride[0]) * pool_stride[0] # Compute number of rows needed in image for all indexes to work out required_r = last_pool_r + pr last_pool_c = last_pool(image_shape[1] ,pool_shape[1], pool_stride[1]) * pool_stride[1] required_c = last_pool_c + pc # final result shape res_r = int(numpy.floor(last_pool_r/rs)) + 1 res_c = int(numpy.floor(last_pool_c/cs)) + 1 for bc01v in get_debug_values(bc01): assert not numpy.any(numpy.isinf(bc01v)) assert bc01v.shape[2] == image_shape[0] assert bc01v.shape[3] == image_shape[1] # padding padded = tensor.alloc(0.0, batch, channel, required_r, required_c) name = bc01.name if name is None: name = 'anon_bc01' bc01 = tensor.set_subtensor(padded[:,:, 0:r, 0:c], bc01) bc01.name = 'zero_padded_' + name # unraveling window = tensor.alloc(0.0, batch, channel, res_r, res_c, pr, pc) window.name = 'unravlled_winodows_' + name for row_within_pool in xrange(pool_shape[0]): row_stop = last_pool_r + row_within_pool + 1 for col_within_pool in xrange(pool_shape[1]): col_stop = last_pool_c + col_within_pool + 1 win_cell = bc01[:,:,row_within_pool:row_stop:rs, col_within_pool:col_stop:cs] window = tensor.set_subtensor(window[:,:,:,:, row_within_pool, col_within_pool], win_cell) # find the norm norm = window.sum(axis = [4, 5]) norm = tensor.switch(tensor.eq(norm, 0.0), 1.0, norm) norm = window / norm.dimshuffle(0, 1, 2, 3, 'x', 'x') # average res = (window * norm).sum(axis=[4,5]) #前面的代碼幾乎和前向傳播代碼同樣,這裏只需加權求和便可 res.name = 'pooled_' + name return res.reshape((batch, channel, res_r, res_c))
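A hypothetical usage sketch for the two functions above (a Python 2 / theano environment matching the pylearn2 code; the shapes are arbitrary, and the input is assumed non-negative, e.g. ReLU outputs):

import theano
import theano.tensor as T

bc01 = T.tensor4('bc01')  # (batch, channels, rows, cols), all values >= 0

# 3x3 pooling regions with stride 2 over 32x32 feature maps
pooled_train = stochastic_max_pool_bc01(bc01, (3, 3), (2, 2), (32, 32))  # training
pooled_test = weighted_max_pool_bc01(bc01, (3, 3), (2, 2), (32, 32))     # test

f_train = theano.function([bc01], pooled_train)
f_test = theano.function([bc01], pooled_test)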
References:
Matthew D. Zeiler and Rob Fergus. Stochastic Pooling for Regularization of Deep Convolutional Neural Networks. ICLR 2013.