VGGnet: From Creating TFrecords to Training the Network

Date: 2019-12-01

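As a beginner among beginners, I find that tinkering always pays off. I have read an introductory book and some tutorials online, and many TF tutorials lead beginners into TensorFlow through the MNIST dataset: they import MNIST directly, build the network, define the loss and optimizer, set the hyperparameters, and then go straight to sess.run(). The workflow looks simple, but if you are handed a pile of your own images, it is not at all clear how to get tensorflow to read them or how to feed them into the network for training.

So, as a beginner, I started with what I took to be the simplest CNN, VGGnet, and downloaded a dataset more or less at random from the internet: GTSRB (because it was the smallest one and quick to download = =). The preprocessing of the downloaded data is covered in another post of mine on image data processing; this post is mainly about creating and reading TFrecords files.

I am not a CS major and my research direction has nothing to do with this (the computer vision direction I agreed on with my advisor when I enrolled has since been vetoed, which at one point made me want to switch advisors; it is a long story, ten thousand words omitted here). I have been juggling my advisor's work and this at the same time, which has been quite the experience = =. I wrestled with this program for nearly two weeks, finally subdued all the bugs, and got it running. I have now entered the so-called "hyperparameter tuning" stage. The results are still far from ideal, and the program below may still contain errors, but for a beginner like me this round of tinkering has already taught me a lot.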

Now to the main topic.

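A TFrecords file is one of the ways tensorflow reads data, and it is mainly used when the dataset is large. A TFRecords file contains tf.train.Example protocol buffers (each protocol buffer contains a Features field).

You can fill your own data into an Example protocol buffer, serialize the protocol buffer to a string, and write it to a TFRecords file with tf.python_io.TFRecordWriter.

To read data back from a TFRecords file, use tf.TFRecordReader to read the serialized records and the tf.parse_single_example parser to decode them. This operation parses an Example protocol buffer into tensors.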
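
The write-then-read cycle described above can be illustrated without TensorFlow. The sketch below is only an analogy, not the real TFRecords format (real files store protocol-buffer tf.train.Example messages and add checksums): each example is serialized to one byte string, written with a length prefix, then read back and parsed. All function names here are hypothetical helpers made up for the illustration.

```python
import io
import struct


def serialize_example(label, img_raw):
    """Pack a one-hot label (list of ints) and raw image bytes into one byte string."""
    header = struct.pack('<II', len(label), len(img_raw))
    body = struct.pack('<%dq' % len(label), *label) + img_raw
    return header + body


def parse_example(record):
    """Inverse of serialize_example: recover (label, img_raw) from a byte string."""
    n_label, n_img = struct.unpack_from('<II', record, 0)
    offset = struct.calcsize('<II')
    label = list(struct.unpack_from('<%dq' % n_label, record, offset))
    offset += 8 * n_label
    img_raw = record[offset:offset + n_img]
    return label, img_raw


def write_records(stream, records):
    """Append each serialized record to the stream, preceded by its length."""
    for record in records:
        stream.write(struct.pack('<Q', len(record)))
        stream.write(record)


def read_records(stream):
    """Yield serialized records one by one until the stream is exhausted."""
    while True:
        prefix = stream.read(8)
        if not prefix:
            return
        (length,) = struct.unpack('<Q', prefix)
        yield stream.read(length)


# Two toy examples: (one-hot label, raw pixel bytes)
examples = [([1, 0, 0], b'\x00\x01\x02'), ([0, 0, 1], b'\xff\xfe')]

buffer = io.BytesIO()  # stands in for the record file on disk
write_records(buffer, [serialize_example(l, i) for l, i in examples])
buffer.seek(0)
decoded = [parse_example(r) for r in read_records(buffer)]
print(decoded == examples)
```

In the real pipeline, tf.train.Example plays the role of serialize_example, TFRecordWriter the role of write_records, and TFRecordReader plus tf.parse_single_example the role of read_records and parse_example.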

The description above comes from: http://www.javashuo.com/article/p-unqlxzti-cx.html

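The code is pasted directly below. Some parts are not original, and most of the explanation is written in the code comments (fine, I admit I am lazy = =; this post will be updated later).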

VGGnet.py:

# -*- coding: utf-8 -*-
import tensorflow as tf
import time
import convert_TFrecords

# Network hyperparameters
learning_rate = 0.005
batch_size = 300
epoch = 20000
display_step = 10

# Network parameters
Dropout = 0.75  # keep probability; the drop probability is 1 - Dropout
n_inputs = 128 * 128 * 3  # input dimension (img_size)
n_classes = 43


weights = {'w1': tf.Variable(tf.random_normal([3, 3, 3, 16])),
           'w2': tf.Variable(tf.random_normal([3, 3, 16, 16])),
           'w3': tf.Variable(tf.random_normal([3, 3, 16, 32])),
           'w4': tf.Variable(tf.random_normal([3, 3, 32, 32])),
           'w5': tf.Variable(tf.random_normal([3, 3, 32, 64])),
           'w6': tf.Variable(tf.random_normal([3, 3, 64, 64])),
           'w7': tf.Variable(tf.random_normal([3, 3, 64, 128])),
           'w8': tf.Variable(tf.random_normal([3, 3, 128, 128])),
           'w9': tf.Variable(tf.random_normal([3, 3, 128, 128])),
           'w10': tf.Variable(tf.random_normal([3, 3, 128, 128])),
           'wd1': tf.Variable(tf.random_normal([8*8*128, 4096])),
           'wd2': tf.Variable(tf.random_normal([1*1*4096, 4096])),
           'out': tf.Variable(tf.random_normal([4096, 43]))}  # 43 classes in total

biases = {'b1': tf.Variable(tf.random_normal([16])),
          'b2': tf.Variable(tf.random_normal([16])),
          'b3': tf.Variable(tf.random_normal([32])),
          'b4': tf.Variable(tf.random_normal([32])),
          'b5': tf.Variable(tf.random_normal([64])),
          'b6': tf.Variable(tf.random_normal([64])),
          'b7': tf.Variable(tf.random_normal([128])),
          'b8': tf.Variable(tf.random_normal([128])),
          'b9': tf.Variable(tf.random_normal([128])),
          'b10': tf.Variable(tf.random_normal([128])),
          'bd1': tf.Variable(tf.random_normal([4096])),
          'bd2': tf.Variable(tf.random_normal([4096])),
          'out': tf.Variable(tf.random_normal([43]))}


def conv(name, input, W, b, strides=1, padding='SAME'):
    # Convolution, then bias, then ReLU
    x = tf.nn.conv2d(input, W, strides=[1, strides, strides, 1], padding=padding)
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x, name=name)


# The input should be a 4-D tensor whose first dimension is batch_size. The network here is
# written as if for a single sample, which does no harm: the batch size is only fixed when the graph is run.
def VGGnet(input, weights, biases, keep_prob):
    x = tf.reshape(input, shape=[-1, 128, 128, 3])   # the value at -1 is determined by batch_size

    # Block 1: 128x128 -> 64x64
    conv1 = conv('conv1', x, weights['w1'], biases['b1'])
    conv2 = conv('conv2', conv1, weights['w2'], biases['b2'])
    pool1 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool1')

    # Block 2: 64x64 -> 32x32
    conv3 = conv('conv3', pool1, weights['w3'], biases['b3'])
    conv4 = conv('conv4', conv3, weights['w4'], biases['b4'])
    pool2 = tf.nn.max_pool(conv4, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool2')

    # Block 3: 32x32 -> 16x16
    conv5 = conv('conv5', pool2, weights['w5'], biases['b5'])
    conv6 = conv('conv6', conv5, weights['w6'], biases['b6'])
    conv7 = conv('conv7', conv6, weights['w7'], biases['b7'])
    pool3 = tf.nn.max_pool(conv7, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool3')

    # Block 4: 16x16 -> 8x8
    conv8 = conv('conv8', pool3, weights['w8'], biases['b8'])
    conv9 = conv('conv9', conv8, weights['w9'], biases['b9'])
    conv10 = conv('conv10', conv9, weights['w10'], biases['b10'])
    pool4 = tf.nn.max_pool(conv10, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME', name='pool4')

    # Fully connected layer 1: flatten to [batch, 8*8*128] first
    fc1 = tf.reshape(pool4, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    re1 = tf.nn.relu(fc1, 're1')
    drop1 = tf.nn.dropout(re1, keep_prob)

    # Fully connected layer 2
    fc2 = tf.reshape(drop1, [-1, weights['wd2'].get_shape().as_list()[0]])
    fc2 = tf.add(tf.matmul(fc2, weights['wd2']), biases['bd2'])
    re2 = tf.nn.relu(fc2, 're2')
    drop2 = tf.nn.dropout(re2, keep_prob)

    # Output layer: raw logits for the 43 classes
    fc3 = tf.reshape(drop2, [-1, weights['out'].get_shape().as_list()[0]])
    fc3 = tf.add(tf.matmul(fc3, weights['out']), biases['out'])

    # print(fc3)  checkpoint

    # tf.nn.softmax_cross_entropy_with_logits already applies softmax internally! There is no need to add
    # another softmax layer (once I found this mistake, the training accuracy finally became normal)
    # sm = tf.nn.softmax(fc3)

    return fc3


# Note that the shape below must match the tensor that is fed in! With the mnist dataset the shape of x is
# [None, 28*28*1] because the data fed in is flattened into rows
x = tf.placeholder(tf.float32, [None, 128, 128, 3])
y = tf.placeholder(tf.float32, [None, n_classes])
dropout = tf.placeholder(tf.float32)

pred = VGGnet(x, weights, biases, dropout)

# Define the loss function and the optimizer
# Error: Only call `softmax_cross_entropy_with_logits` with named arguments (labels=..., logits=...,)
# Fix: pass the arguments as keyword arguments
# tf.nn.softmax_cross_entropy_with_logits first applies softmax to the output of the last layer, then computes
# the cross entropy -sum_i(y_i * log(yhat_i)) for each sample (y_i is the true value, yhat_i the predicted probability);
# tf.reduce_mean then averages these per-sample losses over the batch
# Reference: https://blog.csdn.net/mao_xiao_feng/article/details/53382790
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(loss)

# Evaluation
# tf.argmax() returns the index of the largest element of each vector (axis=1);
# tf.equal() returns whether two values are equal (True or False)
# https://blog.csdn.net/qq575379110/article/details/70538051/
# https://blog.csdn.net/uestc_c2_403/article/details/72232924
correct_pred = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# init = tf.initialize_all_variables()
# Build the input pipeline once, before the session and the queue runners are started
batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch)

with tf.Session() as sess:
    # sess.run(init)
    # Run the initialization first
    # Reference: https://blog.csdn.net/lujiandong1/article/details/53376802
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())  # needed for the num_epochs counter of the filename queue
    # sess.run(tf.initialize_all_variables())

    # Create a coordinator
    coord = tf.train.Coordinator()
    # Use start_queue_runners to start filling the queues
    threads = tf.train.start_queue_runners(sess, coord)

    try:
        step = 1
        while not coord.should_stop():
            # Fetch batch_size samples and labels for each batch
            # The following statement used to sit here (moving it finally fixed a problem that had me stuck for days):
            # batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch)
            # The program simply hung: it would not run and raised no error. Presumably this is because
            # the queues were then created after start_queue_runners had been called, so nothing ever filled them
            # checkpoint: print('kaka')

            # print(batch_x)
            # print(batch_y)
            # print('okok')  checkpoint
            # Without the next statement this error is raised:
            # The value of a feed cannot be a tf.Tensor object. Acceptable feed
            # values include Python scalars, strings, lists, numpy ndarrays, or TensorHandles.
            # I originally thought tensor.eval() was needed to turn the tensor into an np.array, but at that time
            # batch_x, batch_y = convert_TFrecords.inputs(True, batch_size, epoch) was inside the sess block,
            # so the program hung at tensor.eval() in just the same way
            b_x, b_y = sess.run([batch_x, batch_y])
            # print('haha')  checkpoint
            # Printing a tensor: only 3 values are printed by default
            # Reference: https://blog.csdn.net/qq_34484472/article/details/75049179
            # print(b_x, b_y)  checkpoint
            # The variables fed into the dict were originally not named b_x, b_y but had the same names as the
            # keys (x, y); the variable names clashed with the placeholder names, which raises:
            # unhashable type: 'numpy.ndarray' error
            # This error can also have other causes, see: https://blog.csdn.net/wongleetion/article/details/80885648
            start = time.time()
            sess.run(optimizer, feed_dict={x: b_x, y: b_y, dropout: Dropout})
            if step % display_step == 0:
                # I originally mistyped the key dropout as keep_prob in feed_dict, which raised:
                # Cannot interpret feed_dict key as Tensor: Can not convert a float into a Tensor
                # Reference: https://blog.csdn.net/ice_pill/article/details/78567841
                Loss, acc = sess.run([loss, accuracy], feed_dict={x: b_x, y: b_y, dropout: 1.0})
                print('iter ' + str(step) + ', minibatch loss = ' +
                      '{: .6f}'.format(Loss) + ', training accuracy = ' + '{: .5f}'.format(acc))
                # sess.run(tf.Print(b_y, [b_y], summarize=43))
                print(b_y)
            print('iter %d, duration: %.2fs' % (step, time.time() - start))
            step += 1
    except tf.errors.OutOfRangeError:  # raised when the end of the filename queue is reached
        print("done! now lets kill all the threads……")
    finally:
        # The coordinator coord asks all threads to stop
        coord.request_stop()
        print('all threads are asked to stop!')
    coord.join(threads)  # join the started threads and wait for them to finish
    print('all threads are stopped!')
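
The comment in VGGnet() about the extra softmax layer is worth a small numeric check. The following is a plain-Python sketch (no TensorFlow; the helper names and the toy logits are made up for illustration) of what happens when already-softmaxed values are passed to a loss that applies softmax again: the loss can no longer get close to zero, however confident the network is.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(logits, labels):
    """Cross entropy -sum_i(y_i * log(yhat_i)), with softmax applied to the logits first."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(labels, probs))


logits = [8.0, 1.0, -2.0]   # toy output of the last layer, confidently class 0
labels = [1.0, 0.0, 0.0]    # one-hot ground truth

# Correct usage: pass raw logits
correct_loss = softmax_cross_entropy(logits, labels)

# Mistake: softmax applied once in the network and once more inside the loss
double_loss = softmax_cross_entropy(softmax(logits), labels)

# With 43 classes, even a perfect one-hot "probability" vector fed in as logits
# gives a loss of log(1 + 42 / e), so the loss has a floor well above zero
floor_43 = math.log(1 + 42 / math.e)

print(correct_loss, double_loss, floor_43)
```

This fits the symptom described in the comment: with the extra softmax the loss is squeezed into a narrow range and the gradients become small, so training hardly progresses.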

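The 8*8*128 in the shape of 'wd1' follows from the four pooling layers. A small sketch of the arithmetic, assuming the usual rule that a 'SAME'-padded layer with stride s maps a side length n to ceil(n / s) (the helper name is hypothetical):

```python
import math


def same_pool_output(size, stride=2):
    """Output side length of a 'SAME'-padded pooling layer: ceil(size / stride)."""
    return math.ceil(size / stride)


side = 128                    # input images are 128 x 128
for _ in range(4):            # pool1 .. pool4, each with stride 2
    side = same_pool_output(side)

channels = 128                # output channels of conv10
flat_features = side * side * channels
print(side, flat_features)
```

The convolutions use stride 1 with 'SAME' padding, so they leave the side length unchanged; only the pooling layers shrink it. If the input size or the number of pooling layers changes, the first dimension of 'wd1' has to change with it.
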
convert_TFrecords.py (creating and reading the TFrecords file):

# -*- coding: utf-8 -*-

import os
import tensorflow as tf
from PIL import Image

cur_dir = os.getcwd()

# classes = ['test_file_dir', 'train_file_dir']
train_set = os.path.join(cur_dir, 'train_file_dir')
classes = os.listdir(train_set)  # one sub-folder per class


# Create the binary data
def create_record():
    print('processing...')
    writer = tf.python_io.TFRecordWriter('train.tfrecords')
    num_labels = len(classes)
    print('num of classes: %d' % num_labels)
    label = [0] * num_labels
    for index, name in enumerate(classes):
        class_path = os.path.join(train_set, name)
        label[index] = 1  # one-hot label shared by every image in this folder
        for img_name in os.listdir(class_path):
            img_path = os.path.join(class_path, img_name)
            img = Image.open(img_path)
            # img = img.resize((64, 64))
            img_raw = img.tobytes()  # convert the image to raw bytes
            # print(img_raw)
            # print(index,img_raw)
            # A tfrecord data file is a binary file that stores image data and labels together. It makes better
            # use of memory and can be copied, moved, read, and stored quickly in tensorflow.
            # A tfrecord file contains tf.train.Example protocol buffers (each protocol buffer contains Features).
            # You can write a piece of code that fetches your data, fills it into an Example protocol buffer,
            # serializes the protocol buffer to a string, and writes it to a TFRecords file with the
            # tf.python_io.TFRecordWriter class
            example = tf.train.Example(
                # The value of each key in the feature dict is a list of one of three types:
                # FloatList, BytesList, or Int64List
                # Reference: https://blog.csdn.net/u012759136/article/details/52232266
                # Reference: https://blog.csdn.net/shenxiaolu1984/article/details/52857437
                features=tf.train.Features(feature={
                    # Set the label of the image in the TFrecord file (the same for every image in a folder).
                    # Note that what is stored is a list of length num_labels, not a single value!!
                    'label': tf.train.Feature(int64_list=tf.train.Int64List(value=label)),  # label is already a list, no extra brackets needed
                    # Set the pixel data of the image in the TFrecord file
                    'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
                }))
            writer.write(example.SerializeToString())
        label = [0] * num_labels  # reset before the next class
    writer.close()
    print('TFrecords file created successfully!!')


# Read the binary data
def read_and_decode(filename, num_epochs):
    # Build a queue from the file names (in random order, since shuffle=True)
    filename_queue = tf.train.string_input_producer([filename], shuffle=True, num_epochs=num_epochs)
    print('qunide')
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)   # returns the record key and the serialized record
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           # I do not know this function well. The shape for 'label' was originally
                                           # empty ([]), which raised: Key: label, Index: 0.  Number
                                           # of int64 values != expected.  Values size: 43 but output shape: []
                                           # Note that the dtypes must match those in the TFrecords file!!
                                           'label': tf.FixedLenFeature([43], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })

    img = features['img_raw']
    # decode_raw() can only decode data stored as a BytesList
    img = tf.decode_raw(img, tf.uint8)
    img = tf.reshape(img, [128, 128, 3])
    img = tf.cast(img, tf.float32) * (1. / 255) - 0.5     # normalize to the range [-0.5, 0.5]
    label = features['label']
    # label = tf.reshape(label, [43])   not needed: the label was stored with shape [43] in the first place
    label = tf.cast(label, tf.float32)    # because the network output pred is float32!! (?)
    print('label', label)
    print('image', img)

    return img, label


def inputs(train, batch_size, num_epochs):
    print('qunide2')
    if not num_epochs:
        num_epochs = None
    filename = os.path.join(cur_dir, 'train.tfrecords' if train else 'test.tfrecords')  # good enough for now

    with tf.name_scope('input'):
        image, label = read_and_decode(filename, num_epochs)
        # print(image)  checkpoint
        # tf.train.shuffle_batch presumably shuffles the examples coming from the queue produced by
        # tf.train.string_input_producer and then draws batches from them. So the capacity of this shuffled queue
        # and min_after_dequeue (presumably the minimum number of samples left in the queue after a dequeue,
        # which determines how much is refilled) will, depending on batch_size, affect training accuracy
        # (after refilling and reshuffling, many samples the network has not seen before are fed in, and once
        # all the training data has been seen the accuracy improves). This is my personal guess
        images, sparse_labels = tf.train.shuffle_batch([image, label], batch_size=batch_size,
                                                       num_threads=2, capacity=3000,  # the thread count usually equals the number of processor cores,
                                                       # but more threads is not always faster; too many threads can even reduce efficiency
                                                       # Reference: https://blog.csdn.net/lujiandong1/article/details/53376802
                                                       # https://blog.csdn.net/heiheiya/article/details/80967301
                                                       min_after_dequeue=2000)
        # print(images)  checkpoint
        # Note: the dtype and shape of the returned values must both match those of tf.placeholder()!
        return images, sparse_labels


if __name__ == '__main__':
    create_record()
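
One thing to watch in create_record(): the class index comes from the position of each folder in os.listdir(), and os.listdir() returns names in arbitrary order, so the folder-to-label mapping is not guaranteed to be the same on every machine or run. A minimal sketch of one way to pin it down, by sorting the folder names before assigning indices (the helper names and the example folder names are hypothetical):

```python
def build_label_map(class_names):
    """Map each class-folder name to a fixed index by sorting the names first."""
    return {name: index for index, name in enumerate(sorted(class_names))}


def one_hot(index, num_classes):
    """Return a one-hot list of length num_classes with a 1 at position index."""
    label = [0] * num_classes
    label[index] = 1
    return label


# Hypothetical folder names, deliberately out of order as os.listdir() might return them
folders = ['00002', '00000', '00001']

label_map = build_label_map(folders)
labels = {name: one_hot(label_map[name], len(folders)) for name in folders}
print(labels)
```

Using the same sorted mapping when the test set is converted later keeps the label indices consistent between train.tfrecords and test.tfrecords.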