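As a beginner among beginners, I find that tinkering always pays off. I have read an introductory book and some online tutorials, and many TF tutorials lead newcomers in through the MNIST dataset: they import MNIST directly, build the network, define the loss and optimizer, set the hyperparameters, and then go straight to sess.run(). The workflow looks simple, but if I were handed a pile of images myself, I would not know how to make TensorFlow read them or how to feed them into a network for training. So, as a beginner, I decided to start with the simplest CNN, VGGnet, and downloaded a dataset from the web more or less at random: GTSRB (because it was the smallest and the quickest to download = =). The preprocessing of the downloaded data is covered in another post on image data processing; this post is mainly about creating and reading TFRecords files.

I am not a CS major, and my research direction has nothing to do with this (the computer vision direction I agreed on with my advisor when I enrolled has since been rejected, which at one point made me want to switch advisors; it is a long story, so ten thousand words are omitted here). Juggling my advisor's work and this project at the same time has been quite an experience = =. This program took nearly two weeks of tinkering. In the end I subdued all the bugs, got it to run, and entered the so-called "hyperparameter tuning" stage. The results are still far from ideal, and the program below may still contain errors, but for a beginner like me this round of tinkering has already taught me a lot.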
Now to the main topic.
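A TFRecords file is one of the ways TensorFlow reads data, and it is mainly used when the dataset is large. A TFRecords file contains tf.train.Example protocol buffers (each of which holds a Features field). You can fill your own data into an Example protocol buffer, serialize the protocol buffer to a string, and write it to a TFRecords file with tf.python_io.TFRecordWriter.
To read data back from a TFRecords file, use tf.TFRecordReader together with the tf.parse_single_example parser. This operation parses an Example protocol buffer into tensors.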
The content above is from: http://www.javashuo.com/article/p-unqlxzti-cx.html
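The write, serialize and parse cycle described above can be mimicked without TensorFlow. The following is only an analogy built on the standard struct module, not the real TFRecord or protocol buffer format, and the names serialize_record and parse_record are made up for illustration. It shows one thing that matters later in the reading code: raw bytes carry no shape information, so the reader has to know the label length (43) and the image shape (128 x 128 x 3) in advance.

```python
import struct

NUM_CLASSES = 43            # number of GTSRB classes used in this post
IMG_SHAPE = (128, 128, 3)   # image size expected by the network in this post


def serialize_record(label, img_raw):
    """Pack a one-hot label and raw image bytes into a single byte string."""
    header = struct.pack('<%dq' % len(label), *label)  # little-endian int64 values
    return header + img_raw


def parse_record(blob, num_classes=NUM_CLASSES):
    """Split the byte string back into (label, image bytes).

    num_classes must be known beforehand, which plays the same role as the
    fixed shape [43] given to the label feature when parsing a real record.
    """
    fmt = '<%dq' % num_classes
    label_size = struct.calcsize(fmt)
    label = list(struct.unpack(fmt, blob[:label_size]))
    return label, blob[label_size:]


# Hypothetical sample: class 7, all-zero dummy image
label = [0] * NUM_CLASSES
label[7] = 1
img_raw = bytes(IMG_SHAPE[0] * IMG_SHAPE[1] * IMG_SHAPE[2])

blob = serialize_record(label, img_raw)
parsed_label, parsed_img = parse_record(blob)
```

If the parser were given the wrong label length, the split point would move and both fields would come back corrupted without any error, which is why the declared shapes on the reading side must match what was written.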
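I will just paste the code below. Some parts are not original, and most of the explanations are written in the code comments (fine, I admit I am lazy = =; this post will be updated later).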
VGGnet.py:
# -*- coding: utf-8 -*-
import tensorflow as tf
import time
import convert_TFrecords

# Training hyperparameters
learning_rate = 0.005
batch_size = 300
epoch = 20000
display_step = 10

# Network parameters
Dropout = 0.75  # keep probability; the drop probability is 1 - Dropout
n_inputs = 128 * 128 * 3  # input dimension (img_size)
n_classes = 43


weights = {'w1': tf.Variable(tf.random_normal([3, 3, 3, 16])),
           'w2': tf.Variable(tf.random_normal([3, 3, 16, 16])),
           'w3': tf.Variable(tf.random_normal([3, 3, 16, 32])),
           'w4': tf.Variable(tf.random_normal([3, 3, 32, 32])),
           'w5': tf.Variable(tf.random_normal([3, 3, 32, 64])),
           'w6': tf.Variable(tf.random_normal([3, 3, 64, 64])),
           'w7': tf.Variable(tf.random_normal([3, 3, 64, 128])),
           'w8': tf.Variable(tf.random_normal([3, 3, 128, 128])),
           'w9': tf.Variable(tf.random_normal([3, 3, 128, 128])),
           'w10': tf.Variable(tf.random_normal([3, 3, 128, 128])),
           'wd1': tf.Variable(tf.random_normal([8*8*128, 4096])),
           'wd2': tf.Variable(tf.random_normal([1*1*4096, 4096])),
           'out': tf.Variable(tf.random_normal([4096, 43]))}  # 43 classes in total

biases = {'b1': tf.Variable(tf.random_normal([16])),
          'b2': tf.Variable(tf.random_normal([16])),
          'b3': tf.Variable(tf.random_normal([32])),
          'b4': tf.Variable(tf.random_normal([32])),
          'b5': tf.Variable(tf.random_normal([64])),
          'b6': tf.Variable(tf.random_normal([64])),
          'b7': tf.Variable(tf.random_normal([128])),
          'b8': tf.Variable(tf.random_normal([128])),
          'b9': tf.Variable(tf.random_normal([128])),
          'b10': tf.Variable(tf.random_normal([128])),
          'bd1': tf.Variable(tf.random_normal([4096])),
          'bd2': tf.Variable(tf.random_normal([4096])),
          'out': tf.Variable(tf.random_normal([43]))}


def conv(name, input, W, b, strides=1, padding='SAME'):
    # 2-D convolution followed by bias add and ReLU
    x = tf.nn.conv2d(input, W, strides=[1, strides, strides, 1], padding=padding)
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x, name=name)


# The input should be a 4-D tensor whose first dimension is batch_size. The network
# is written as if for a single sample; this makes no difference, because
# batch_size is only fixed when the graph is run.
def VGGnet(input, weights, biases, keep_prob):
    x = tf.reshape(input, shape=[-1, 128, 128, 3])  # the -1 is determined by batch_size
    conv1 = conv('conv1', x, weights['w1'], biases['b1'])

    # Fixed: the original passed biases['b1'] here, leaving 'b2' unused
    conv2 = conv('conv2', conv1, weights['w2'], biases['b2'])

    pool1 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool1')

    conv3 = conv('conv3', pool1, weights['w3'], biases['b3'])

    conv4 = conv('conv4', conv3, weights['w4'], biases['b4'])

    pool2 = tf.nn.max_pool(conv4, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool2')

    conv5 = conv('conv5', pool2, weights['w5'], biases['b5'])

    conv6 = conv('conv6', conv5, weights['w6'], biases['b6'])

    conv7 = conv('conv7', conv6, weights['w7'], biases['b7'])

    pool3 = tf.nn.max_pool(conv7, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool3')

    conv8 = conv('conv8', pool3, weights['w8'], biases['b8'])

    conv9 = conv('conv9', conv8, weights['w9'], biases['b9'])

    conv10 = conv('conv10', conv9, weights['w10'], biases['b10'])

    pool4 = tf.nn.max_pool(conv10, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool4')

    # Four 2x2 poolings reduce 128x128 to 8x8, hence the 8*8*128 rows of 'wd1'
    fc1 = tf.reshape(pool4, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])

    re1 = tf.nn.relu(fc1, 're1')

    drop1 = tf.nn.dropout(re1, keep_prob)

    fc2 = tf.reshape(drop1, [-1, weights['wd2'].get_shape().as_list()[0]])
    fc2 = tf.add(tf.matmul(fc2, weights['wd2']), biases['bd2'])

    re2 = tf.nn.relu(fc2, 're2')

    drop2 = tf.nn.dropout(re2, keep_prob)

    fc3 = tf.reshape(drop2, [-1, weights['out'].get_shape().as_list()[0]])
    fc3 = tf.add(tf.matmul(fc3, weights['out']), biases['out'])

    # print(fc3)  checkpoint

    # tf.nn.softmax_cross_entropy_with_logits already applies softmax, so no extra
    # softmax layer is needed (after I found this mistake, the training accuracy
    # finally became normal)
    # sm = tf.nn.softmax(fc3)

    return fc3


# Note: the shapes below must match the tensors that are fed in! With the MNIST
# dataset the shape of x is [None, 28*28*1] because the data is flattened into rows.
x = tf.placeholder(tf.float32, [None, 128, 128, 3])
y = tf.placeholder(tf.float32, [None, n_classes])
dropout = tf.placeholder(tf.float32)

pred = VGGnet(x, weights, biases, dropout)

# Define the loss function and the optimizer
# Error: Only call `softmax_cross_entropy_with_logits` with named arguments (labels=..., logits=...,)
# Fix: pass the arguments as keyword arguments.
# tf.nn.softmax_cross_entropy_with_logits first applies softmax to the output of the
# last layer and then computes the cross-entropy -sum_i(y_i * log(p_i)) for each sample
# (y_i is the true value, p_i the predicted probability);
# tf.reduce_mean then averages these per-sample losses over the batch.
# Reference: https://blog.csdn.net/mao_xiao_feng/article/details/53382790
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

# Evaluation
# tf.argmax() returns the index of the largest element of each vector (axis=1);
# tf.equal() returns element-wise whether two values are equal (True or False)
# https://blog.csdn.net/qq575379110/article/details/70538051/
# https://blog.csdn.net/uestc_c2_403/article/details/72232924
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# init = tf.initialize_all_variables()
# Build the input pipeline once, outside the session and outside the training loop
batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch)

with tf.Session() as sess:
    # sess.run(init)
    # Run the initializers first (local variables are needed because of num_epochs)
    # Reference: https://blog.csdn.net/lujiandong1/article/details/53376802
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())
    # sess.run(tf.initialize_all_variables())

    # Create a coordinator
    coord = tf.train.Coordinator()
    # start_queue_runners starts the threads that fill the queues
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)

    try:
        step = 1
        while not coord.should_stop():
            # Fetch batch_size samples and labels for each batch.
            # The following line used to be here (moving it finally solved a problem
            # that had blocked me for days):
            # batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch)
            # With it here the program hung: it neither ran nor reported an error.
            # checkpoint: print('kaka')

            # print(batch_x)
            # print(batch_y)
            # print('okok')  checkpoint
            # Without the next statement this error is raised:
            # The value of a feed cannot be a tf.Tensor object. Acceptable feed
            # values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles.
            # I first thought tensor.eval() was needed to turn the tensor into an np.array,
            # but at that time convert_TFrecords.inputs(...) was still called inside the
            # session, so execution hung at tensor.eval() just the same.
            b_x, b_y = sess.run([batch_x, batch_y])
            # print('haha')  checkpoint
            # Printing a tensor shows 3 values by default.
            # Reference: https://blog.csdn.net/qq_34484472/article/details/75049179
            # print(b_x, b_y)  checkpoint
            # The arrays fed into the dict were originally named x and y, the same as the
            # keys, instead of b_x and b_y. The names clashed with the placeholders and
            # caused: unhashable type: 'numpy.ndarray' error
            # This error can also have other causes, see:
            # https://blog.csdn.net/wongleetion/article/details/80885648
            start = time.time()
            sess.run(optimizer, feed_dict={x: b_x, y: b_y, dropout: Dropout})
            if step % display_step == 0:
                # I once mistyped the key dropout as keep_prob in feed_dict, which raised:
                # Cannot interpret feed_dict key as Tensor: Can not convert a float into a Tensor
                # Reference: https://blog.csdn.net/ice_pill/article/details/78567841
                Loss, acc = sess.run([loss, accuracy], feed_dict={x: b_x, y: b_y, dropout: 1.0})
                print('iter ' + str(step) + ', minibatch loss = ' +
                      '{: .6f}'.format(Loss) + ', training accuracy = ' + '{: .5f}'.format(acc))
                # sess.run(tf.Print(b_y, [b_y], summarize=43))
                print(b_y)
                print('iter %d, duration: %.2fs' % (step, time.time() - start))
            step += 1
    except tf.errors.OutOfRangeError:  # raised when the end of the file queue is reached
        print("done! now lets kill all the threads……")
    finally:
        # The coordinator asks all threads to stop
        coord.request_stop()
        print('all threads are asked to stop!')
    coord.join(threads)  # wait in the main thread until the started threads finish
    print('all threads are stopped!')
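The comments in VGGnet.py about tf.nn.softmax_cross_entropy_with_logits can be checked by hand. The following is a minimal pure-Python sketch of the same arithmetic, not TensorFlow's implementation; the helper names softmax, cross_entropy_from_logits and mean_loss and the toy logits are assumptions made for illustration. It also reproduces the mistake mentioned in the code: passing values that have already gone through softmax applies softmax twice.

```python
import math


def softmax(logits):
    """Numerically stable softmax of one logit vector."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, one_hot):
    """Per-sample loss: -sum_i y_i * log(softmax(logits)_i)."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(one_hot, probs) if y > 0)


def mean_loss(batch_logits, batch_labels):
    """Average of the per-sample losses over the batch (the role of tf.reduce_mean)."""
    losses = [cross_entropy_from_logits(l, y)
              for l, y in zip(batch_logits, batch_labels)]
    return sum(losses) / len(losses)


# Toy batch: 2 samples, 3 classes, both predicted correctly with high confidence
batch_logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
batch_labels = [[1, 0, 0], [0, 0, 1]]

correct = mean_loss(batch_logits, batch_labels)

# The mistake: an extra softmax layer before the loss, so softmax is applied twice
double = mean_loss([softmax(l) for l in batch_logits], batch_labels)
```

Because softmax outputs lie between 0 and 1, a second softmax sees almost equal inputs and returns a nearly uniform distribution. In this toy batch double is clearly larger than correct even though the predictions are right, which is consistent with the accuracy problem described in the code comment.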
convert_TFrecords.py (creating and reading the TFRecords file):
# -*- coding: utf-8 -*-

import os
import tensorflow as tf
from PIL import Image

cur_dir = os.getcwd()

# classes = ['test_file_dir', 'train_file_dir']
train_set = os.path.join(cur_dir, 'train_file_dir')
# os.listdir returns entries in arbitrary order; sorting keeps the
# class-to-index mapping reproducible between runs
classes = sorted(os.listdir(train_set))


# Create the binary data
def create_record():
    print('processing...')
    writer = tf.python_io.TFRecordWriter('train.tfrecords')
    num_labels = len(classes)
    print('num of classes: %d' % num_labels)
    for index, name in enumerate(classes):
        class_path = os.path.join(train_set, name)
        # One-hot label shared by every image in this class folder
        label = [0] * num_labels
        label[index] = 1
        for img_name in os.listdir(class_path):
            img_path = os.path.join(class_path, img_name)
            img = Image.open(img_path)
            # img = img.resize((64, 64))
            img_raw = img.tobytes()  # convert the image to raw bytes
            # print(img_raw)
            # print(index,img_raw)
            # A tfrecord file is a binary file that stores image data and labels together.
            # It makes better use of memory and can be copied, moved, read and stored
            # quickly in tensorflow. It contains tf.train.Example protocol buffers (which
            # hold the Features). You write code that fetches your data, fills it into an
            # Example protocol buffer, serializes the buffer to a string, and writes it to
            # the TFRecords file with the tf.python_io.TFRecordWriter class.
            example = tf.train.Example(
                # The value of each key in the feature dict is a list of one of three
                # types: FloatList, BytesList or Int64List
                # Reference: https://blog.csdn.net/u012759136/article/details/52232266
                # Reference: https://blog.csdn.net/shenxiaolu1984/article/details/52857437
                features=tf.train.Features(feature={
                    # Label of the image in the TFrecord file (same for all images in one
                    # folder). Note that a list of length num_labels is stored, not a
                    # single value! label is already a list, so no extra brackets.
                    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=label)),
                    # Pixel data of the image in the TFrecord file
                    'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
                }))
            writer.write(example.SerializeToString())
    writer.close()
    print('TFrecords file created successfully!!')


# Read the binary data
def read_and_decode(filename, num_epochs):
    # Build a queue of file names; shuffle=True shuffles the file names in each
    # epoch, which has no effect with a single file
    filename_queue = tf.train.string_input_producer([filename], shuffle=True, num_epochs=num_epochs)
    print('qunide')
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)  # returns a key and the serialized record
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           # I do not fully understand this function. The shape for
                                           # 'label' used to be empty ([]), which raised:
                                           # Key: label, Index: 0. Number of int64 values != expected.
                                           # Values size: 43 but output shape: []
                                           # The data types must match those in the TFRecords file!!
                                           'label': tf.FixedLenFeature([43], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })

    img = features['img_raw']
    # decode_raw() can only decode data stored as a BytesList
    img = tf.decode_raw(img, tf.uint8)
    img = tf.reshape(img, [128, 128, 3])
    img = tf.cast(img, tf.float32) * (1. / 255) - 0.5  # normalize to the range -0.5 to 0.5
    label = features['label']
    # label = tf.reshape(label, [43])  not needed, the stored shape is already [43]
    label = tf.cast(label, tf.float32)  # match the float32 y placeholder and network output
    print('label', label)
    print('image', img)

    return img, label


def inputs(train, batch_size, num_epochs):
    print('qunide2')
    if not num_epochs:
        num_epochs = None
    filename = os.path.join(cur_dir, 'train.tfrecords' if train else 'test.tfrecords')  # good enough for now

    with tf.name_scope('input'):
        image, label = read_and_decode(filename, num_epochs)
        # print(image)  checkpoint
        # tf.train.shuffle_batch buffers decoded examples in a shuffling queue holding at
        # most `capacity` elements and draws each batch from it at random.
        # min_after_dequeue is the minimum number of elements left in the queue after a
        # dequeue, which sets how well the examples are mixed.
        # My personal guess: relative to batch_size, these two values affect training
        # accuracy (after refilling and shuffling, many samples the network has not seen
        # are fed in, and accuracy improves once all training data has been seen once).
        # The number of threads is usually set to the number of processor cores, but more
        # threads are not always faster and can even reduce efficiency.
        # Reference: https://blog.csdn.net/lujiandong1/article/details/53376802
        # https://blog.csdn.net/heiheiya/article/details/80967301
        images, sparse_labels = tf.train.shuffle_batch([image, label], batch_size=batch_size,
                                                       num_threads=2, capacity=3000,
                                                       min_after_dequeue=2000)
        # print(images)  checkpoint
        # Note: the dtype and shape of the returned values must match tf.placeholder()!
        return images, sparse_labels


if __name__ == '__main__':
    create_record()
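The effect of min_after_dequeue is easier to see in a toy model. create_record writes the images class by class, so the records in train.tfrecords are grouped by class. The sketch below is a simplified, single-threaded stand-in for a shuffling queue and not TensorFlow's implementation: the function shuffle_queue and the 10-class toy data are hypothetical, and capacity is left out because in this model it would only limit how far ahead the reader may run.

```python
import random


def shuffle_queue(stream, min_after_dequeue, seed=0):
    """Yield the items of `stream` in partly shuffled order.

    An item is only handed out while more than `min_after_dequeue` items are
    buffered, so every draw picks at random from a pool of that size plus one.
    """
    rng = random.Random(seed)
    pool = []
    for item in stream:
        pool.append(item)
        if len(pool) > min_after_dequeue:
            yield pool.pop(rng.randrange(len(pool)))
    # Input exhausted: hand out the remaining items in random order
    rng.shuffle(pool)
    yield from pool


# 10 hypothetical classes with 100 samples each, stored class by class
labels_on_disk = [c for c in range(10) for _ in range(100)]

shuffled = list(shuffle_queue(labels_on_disk, min_after_dequeue=50))
first_batch = shuffled[:100]
classes_in_first_batch = sorted(set(first_batch))
```

In this model the first 100 outputs can only come from the first 150 records, so classes_in_first_batch contains at most two of the ten classes. If the real pipeline behaves similarly, a min_after_dequeue that is small compared with the number of images per class could make the early batches unbalanced, and shuffling the image list before writing the TFRecords file would be one way to avoid that. This is a guess that has not been tested on GTSRB.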
Although the program runs successfully, the training accuracy is very low, and many things still need to be adjusted.
Besides the posts mentioned in the code, I also consulted the following:
https://blog.csdn.net/dcrmg/article/details/79780331
https://blog.csdn.net/qq_30666517/article/details/79715045
http://www.javashuo.com/article/p-unqlxzti-cx.html
https://blog.csdn.net/tengxing007/article/details/56847828