#### tf.nn.conv2d

When building a CNN with TF, the convolution operation is invoked as follows:
```python
convolution = tf.nn.conv2d(X, filters, strides=[1, 2, 2, 1], padding="SAME")
```
What does each parameter of this function mean?
- X: the mini-batch of input data, a 4D tensor of shape [n_batch, height, width, channels].
- filters: the convolution kernels, a 4D tensor of shape [filter_height, filter_width, in_channels, out_channels].
- strides: the step size, given as [1, stride, stride, 1]. Internally, conv2d first flattens the filter into a 2D matrix of shape [filter_height * filter_width * in_channels, out_channels], then extracts patches from the image to form a virtual tensor of shape [batch, out_height, out_width, filter_height * filter_width * in_channels], and finally right-multiplies each patch by the filter matrix (see the sketch after this list).
- padding: a string, either "SAME" or "VALID"; this value determines how the convolution is padded. The two modes are compared below.
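The flatten-and-multiply procedure described under strides can be reproduced by hand with tf.extract_image_patches. The following is only a minimal sketch of that equivalence; the input and kernel shapes are made up purely for illustration:

```python
import numpy as np
import tensorflow as tf

x = tf.random_normal([1, 8, 8, 3])    # [batch, height, width, in_channels]
w = tf.random_normal([3, 3, 3, 16])   # [filter_height, filter_width, in_channels, out_channels]

# Built-in convolution.
conv = tf.nn.conv2d(x, w, strides=[1, 2, 2, 1], padding="VALID")

# Same result via "extract patches, then right-multiply by the flattened filter".
patches = tf.extract_image_patches(x, ksizes=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                                   rates=[1, 1, 1, 1], padding="VALID")  # [1, oh, ow, 3*3*3]
w_flat = tf.reshape(w, [3 * 3 * 3, 16])                                  # [fh*fw*in, out]
manual = tf.tensordot(patches, w_flat, axes=[[3], [0]])                  # [1, oh, ow, 16]

with tf.Session() as sess:
    a, b = sess.run([conv, manual])
    print(np.allclose(a, b, atol=1e-4))  # True, up to float rounding
```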
With VALID, if the remaining pixels are not enough for another full step of the convolution, they are simply dropped; with SAME, the input is zero-padded instead.
```python
import numpy as np
import tensorflow as tf

filter_primes = np.array([2., 3., 5., 7., 11., 13.], dtype=np.float32)
x = tf.constant(np.arange(1, 13 + 1, dtype=np.float32).reshape([1, 1, 13, 1]))
filters = tf.constant(filter_primes.reshape(1, 6, 1, 1))

valid_conv = tf.nn.conv2d(x, filters, strides=[1, 1, 5, 1], padding='VALID')
same_conv = tf.nn.conv2d(x, filters, strides=[1, 1, 5, 1], padding='SAME')

with tf.Session() as sess:
    print("VALID:\n", valid_conv.eval())
    print("SAME:\n", same_conv.eval())
```
The output is:
```
VALID:
 [[[[ 184.]
   [ 389.]]]]
SAME:
 [[[[ 143.]
   [ 348.]
   [ 204.]]]]
```
The actual dot products computed are shown below:
```python
print("VALID:")
print(np.array([1, 2, 3, 4, 5, 6]).T.dot(filter_primes))
print(np.array([6, 7, 8, 9, 10, 11]).T.dot(filter_primes))
print("SAME:")
print(np.array([0, 1, 2, 3, 4, 5]).T.dot(filter_primes))
print(np.array([5, 6, 7, 8, 9, 10]).T.dot(filter_primes))
print(np.array([10, 11, 12, 13, 0, 0]).T.dot(filter_primes))

# >> VALID:
#    184.0
#    389.0
#    SAME:
#    143.0
#    348.0
#    204.0
```
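The SAME windows above come from the standard padding rule: the output width is ceil(in_width / stride), and the total zero padding is max((out_width - 1) * stride + filter_width - in_width, 0), with the smaller half placed on the left. A quick sketch of that bookkeeping for this example:

```python
import math

in_width, filter_width, stride = 13, 6, 5

out_width = int(math.ceil(in_width / float(stride)))                      # 3
pad_total = max((out_width - 1) * stride + filter_width - in_width, 0)    # 3
pad_left = pad_total // 2                                                 # 1
pad_right = pad_total - pad_left                                          # 2

print(out_width, pad_left, pad_right)
# With 1 zero on the left and 2 on the right, the padded row is
# [0, 1, ..., 13, 0, 0], which gives exactly the three windows above.
```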
Now for another small experiment. Using VALID:
```python
input = tf.Variable(tf.random_normal([1, 5, 5, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 1]))
op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='VALID')

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(op)
    # print(sess.run(op))

# >> Tensor("Conv2D:0", shape=(1, 2, 2, 1), dtype=float32)
```
Using SAME:
```python
input = tf.Variable(tf.random_normal([1, 5, 5, 5]))
filter = tf.Variable(tf.random_normal([3, 3, 5, 1]))
op = tf.nn.conv2d(input, filter, strides=[1, 2, 2, 1], padding='SAME')

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print(op)
    # print(sess.run(op))

# >> Tensor("Conv2D:0", shape=(1, 3, 3, 1), dtype=float32)
```
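The two shapes agree with the usual output-size formulas: out = ceil((in - filter + 1) / stride) for VALID and out = ceil(in / stride) for SAME. Checking them for the 5x5 input above:

```python
import math

in_size, k, s = 5, 3, 2

out_valid = int(math.ceil((in_size - k + 1) / float(s)))  # ceil(3 / 2) = 2
out_same = int(math.ceil(in_size / float(s)))             # ceil(5 / 2) = 3

print(out_valid, out_same)  # 2 3  -> shapes (1, 2, 2, 1) and (1, 3, 3, 1)
```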
**Note:** during the convolution each filter has shape [height, width, in_channels]; in other words, if the input has only one channel the filter is simply a matrix, whereas if the input has 3 channels the filter gains a depth of 3.
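To make the note concrete, here is a small shape check (the 28x28 inputs and the 8 output channels are arbitrary choices for illustration):

```python
import tensorflow as tf

gray = tf.random_normal([1, 28, 28, 1])   # single-channel input
rgb = tf.random_normal([1, 28, 28, 3])    # three-channel input

k_gray = tf.random_normal([3, 3, 1, 8])   # each of the 8 filters is a 3x3 matrix
k_rgb = tf.random_normal([3, 3, 3, 8])    # each of the 8 filters is 3x3 with depth 3

print(tf.nn.conv2d(gray, k_gray, strides=[1, 1, 1, 1], padding='SAME').shape)  # (1, 28, 28, 8)
print(tf.nn.conv2d(rgb, k_rgb, strides=[1, 1, 1, 1], padding='SAME').shape)    # (1, 28, 28, 8)
```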
#### tf.layers.conv2d

TF also provides the tf.layers.conv2d method:
```python
def conv2d(inputs,
           filters,
           kernel_size,
           strides=(1, 1),
           padding='valid',
           data_format='channels_last',
           dilation_rate=(1, 1),
           activation=None,
           use_bias=True,
           kernel_initializer=None,
           bias_initializer=init_ops.zeros_initializer(),
           kernel_regularizer=None,
           bias_regularizer=None,
           activity_regularizer=None,
           trainable=True,
           name=None,
           reuse=None):
```
This method does the same thing as tf.nn.conv2d and is essentially a higher-level API built on top of it. The call chains of the two methods are:
```
tf.layers.conv2d -> tf.nn.convolution

tf.layers.conv2d -> Conv2D -> Conv2D.apply() -> _Conv -> _Conv.apply()
    -> _Layer.apply() -> _Layer.__call__() -> _Conv.call() -> nn.convolution() ...
```
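Since both ultimately call nn.convolution, they should produce identical values when given the same kernel. A small sanity-check sketch (the random input, the fixed kernel, and use_bias=False are my own choices made just for this comparison):

```python
import numpy as np
import tensorflow as tf

x_val = np.random.rand(1, 7, 7, 3).astype(np.float32)
k_val = np.random.rand(3, 3, 3, 4).astype(np.float32)

x = tf.constant(x_val)

low = tf.nn.conv2d(x, tf.constant(k_val), strides=[1, 1, 1, 1], padding='SAME')
high = tf.layers.conv2d(x, filters=4, kernel_size=3, strides=(1, 1), padding='same',
                        use_bias=False,
                        kernel_initializer=tf.constant_initializer(k_val))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    a, b = sess.run([low, high])
    print(np.allclose(a, b, atol=1e-5))  # True
```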
I built the same neural network with each of these two methods; the rest of the code is completely identical, yet the accuracies I obtained differ greatly. The code is shown below. Why is the difference so large?
```python
def conv2d(self, input, ksize, stride, name):
    with tf.name_scope(name):
        with tf.variable_scope(name):
            w = tf.get_variable("%s-w" % name, shape=ksize,
                                initializer=tf.truncated_normal_initializer())
            b = tf.get_variable("%s-b" % name, shape=[ksize[-1]],
                                initializer=tf.constant_initializer())
            out = tf.nn.conv2d(input, w, strides=[1, stride, stride, 1],
                               padding="SAME", name="%s-conv" % name)
            out = tf.nn.bias_add(out, b, name='%s-bias_add' % name)
            out = tf.nn.relu(out, name="%s-relu" % name)
            return out
```
```python
conv1 = tf.layers.conv2d(X, filters=conv1_fmaps,
                         kernel_size=conv1_ksize, strides=conv1_stride,
                         padding=conv1_pad, activation=tf.nn.relu, name='conv1')
```
I have not yet found out why the difference is so large; if you know the answer, please point it out. Thanks in advance.
#### The default kernel_initializer in tf.layers.conv2d

In tf.layers.conv2d the default kernel_initializer is None. Checking the source code:
```python
self.kernel = vs.get_variable('kernel',
                              shape=kernel_shape,
                              initializer=self.kernel_initializer,
                              regularizer=self.kernel_regularizer,
                              trainable=True,
                              dtype=self.dtype)
```
which comes with the following note:
If initializer is `None` (the default), the default initializer passed in the constructor is used. If that one is `None` too, we use a new `glorot_uniform_initializer`. If initializer is a Tensor, we use it as a value and derive the shape from the initializer.
In other words, the kernels are initialized with glorot_uniform_initializer. This method is also known as the Xavier uniform initializer; the related paper is linked here. tf.layers.dense in TF uses the same default initializer. I changed all of the initializers to tf.truncated_normal_initializer, but the results of the model above did not improve at all, so the initialization method does not appear to be the main cause. Any pointers are welcome.
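For reference, this is roughly what the two initializer choices look like when written out explicitly on the tf.layers.conv2d call from above (a sketch only, assuming a TF 1.x version that exposes tf.glorot_uniform_initializer; conv1_fmaps, conv1_ksize, conv1_stride and conv1_pad are the hyperparameters defined elsewhere in the original code):

```python
# Default behaviour, spelled out explicitly: Glorot/Xavier uniform initialization.
conv1 = tf.layers.conv2d(X, filters=conv1_fmaps, kernel_size=conv1_ksize,
                         strides=conv1_stride, padding=conv1_pad,
                         activation=tf.nn.relu,
                         kernel_initializer=tf.glorot_uniform_initializer(),
                         name='conv1')

# The truncated-normal variant mentioned above, matching the hand-written layer.
conv1_tn = tf.layers.conv2d(X, filters=conv1_fmaps, kernel_size=conv1_ksize,
                            strides=conv1_stride, padding=conv1_pad,
                            activation=tf.nn.relu,
                            kernel_initializer=tf.truncated_normal_initializer(),
                            name='conv1_tn')
```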