CIFAR-10 is a small dataset for recognizing common objects, compiled by Hinton's students Alex Krizhevsky and Ilya Sutskever. It contains RGB color images in 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck. Each image is 32 × 32, and the dataset has 50000 training images and 10000 test images. The training procedure described here follows the official example: https://www.tensorflow.org/tutorials/images/deep_cnn
The download script is as follows:
# coding:utf-8
import os
import sys
import tarfile

import tensorflow as tf
from six.moves import urllib

# tf.app.flags.FLAGS is TensorFlow's global flag store; it also handles command-line arguments
FLAGS = tf.app.flags.FLAGS
# Define tf.app.flags.FLAGS.data_dir as the CIFAR-10 data path
tf.app.flags.DEFINE_string('data_dir', '/tmp/cifar10_data',
                           """Path to the CIFAR-10 data directory.""")
# Change the path to cifar10_data
FLAGS.data_dir = 'cifar10_data/'

DATA_URL = 'http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz'


# If the data files do not exist yet, download them
def maybe_download_and_extract():
    """Download and extract the tarball from Alex's website."""
    dest_directory = FLAGS.data_dir
    if not os.path.exists(dest_directory):
        os.makedirs(dest_directory)
    filename = DATA_URL.split('/')[-1]
    filepath = os.path.join(dest_directory, filename)
    if not os.path.exists(filepath):
        def _progress(count, block_size, total_size):
            sys.stdout.write('\r>> Downloading %s %.1f%%' %
                             (filename, float(count * block_size) / float(total_size) * 100.0))
            sys.stdout.flush()
        filepath, _ = urllib.request.urlretrieve(DATA_URL, filepath, _progress)
        print()
        statinfo = os.stat(filepath)
        print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
    extracted_dir_path = os.path.join(dest_directory, 'cifar-10-batches-bin')
    if not os.path.exists(extracted_dir_path):
        tarfile.open(filepath, 'r:gz').extractall(dest_directory)


if __name__ == '__main__':
    maybe_download_and_extract()
A .txt file in the archive stores the English name of each class, and each .bin file holds 10,000 images.
How TensorFlow programs read data is described in the official Chinese documentation: http://tensorfly.cn/tfdoc/how_tos/reading_data.html
The usual approach is to read the data into memory and then hand it to the GPU or CPU for computation. Suppose reading takes 0.1 s and computing takes 0.9 s; then in every second the GPU sits idle for 0.1 s, which significantly lowers throughput.
The solution is to put reading and computation in two separate threads, with the reading thread pushing data into an in-memory queue.
The reading thread continuously loads images from the file system into the memory queue, while another thread does the computation; whenever the compute thread needs data, it simply dequeues it from the memory queue. This eliminates the GPU idle time caused by I/O.
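This is just a producer-consumer pattern. Below is a minimal sketch in plain Python (not part of the TensorFlow pipeline; the file names and timings are made up to mirror the 0.1 s / 0.9 s example above): one thread keeps filling a bounded queue while the main thread consumes from it.

import threading
import time
from six.moves import queue  # Queue on Python 2, queue on Python 3

q = queue.Queue(maxsize=8)                   # the "memory queue"
filenames = ['A.jpg', 'B.jpg', 'C.jpg'] * 5  # pretend file list

def reader():
    for name in filenames:
        time.sleep(0.1)                      # pretend the disk read takes 0.1 s
        q.put('pixels of ' + name)           # push the "image" into the queue
    q.put(None)                              # sentinel: nothing left to read

threading.Thread(target=reader).start()

while True:
    item = q.get()                           # the compute thread just dequeues
    if item is None:
        break
    time.sleep(0.9)                          # pretend the computation takes 0.9 s

Because reading (0.1 s) is faster than computing (0.9 s), the queue stays ahead of the consumer and the compute thread never has to wait for I/O.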
Machine learning has the notion of an epoch: one epoch means every image in the training set has been used once. To handle epochs, a "filename queue" is added in front of the memory queue.
TensorFlow reads files through this two-queue scheme, filename queue + memory queue, which makes epoch management straightforward.
Take three images A, B and C with epoch = 1 as an example: the filename queue is filled with A, B, C once, and the memory queue keeps dequeuing filenames from it and loading the corresponding images; once the filename queue is exhausted, the epoch is over.
The test code is as follows:
# coding:utf-8
import os
if not os.path.exists('read'):
    os.makedirs('read/')

# Import TensorFlow
import tensorflow as tf

# Create a new Session
with tf.Session() as sess:
    # We want to read three images: A.jpg, B.jpg, C.jpg
    filename = ['A.jpg', 'B.jpg', 'C.jpg']
    # string_input_producer creates a filename queue
    filename_queue = tf.train.string_input_producer(filename, shuffle=False, num_epochs=5)
    # The reader pulls data from the filename queue via reader.read
    reader = tf.WholeFileReader()
    key, value = reader.read(filename_queue)
    # tf.train.string_input_producer defines an epoch (local) variable, which must be initialized
    tf.local_variables_initializer().run()
    # The queue only starts filling after start_queue_runners is called
    threads = tf.train.start_queue_runners(sess=sess)
    i = 0
    while True:
        i += 1
        # Fetch the image data and save it
        image_data = sess.run(value)
        with open('read/test_%d.jpg' % i, 'wb') as f:
            f.write(image_data)
    # The program eventually raises an OutOfRangeError: the epochs are done and the queue is closed
The output is:
[root@node5 chapter_02]# python test.py
2018-10-30 16:28:27.836579: I tensorflow/core/platform/cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Traceback (most recent call last):
  File "test.py", line 26, in <module>
    image_data = sess.run(value)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 889, in run
    run_metadata_ptr)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1120, in _run
    feed_dict_tensor, options, run_metadata)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1317, in _do_run
    options, run_metadata)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1336, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.OutOfRangeError: FIFOQueue '_0_input_producer' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/device:CPU:0"](WholeFileReaderV2, input_producer)]]

Caused by op u'ReaderReadV2', defined at:
  File "test.py", line 17, in <module>
    key, value = reader.read(filename_queue)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/io_ops.py", line 195, in read
    return gen_io_ops._reader_read_v2(self._reader_ref, queue_ref, name=name)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/ops/gen_io_ops.py", line 673, in _reader_read_v2
    queue_handle=queue_handle, name=name)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 787, in _apply_op_helper
    op_def=op_def)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2956, in create_op
    op_def=op_def)
  File "/usr/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1470, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

OutOfRangeError (see above for traceback): FIFOQueue '_0_input_producer' is closed and has insufficient elements (requested 1, current size 0)
	 [[Node: ReaderReadV2 = ReaderReadV2[_device="/job:localhost/replica:0/task:0/device:CPU:0"](WholeFileReaderV2, input_producer)]]
Each sample consists of 3073 bytes: the first byte is the label and the remaining 3072 bytes are the image data. There are no separator bytes between samples, so each of these binary files is exactly 30730000 bytes (3073 × 10000).
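To make the layout concrete, here is a small sketch (my own illustration, not part of the original tutorial) that parses the first record of a batch file with numpy; the file path assumes the download location used above.

import numpy as np

record_bytes = 1 + 32 * 32 * 3  # 1 label byte + 3072 image bytes = 3073

# Path assumed from the download step above
with open('cifar10_data/cifar-10-batches-bin/data_batch_1.bin', 'rb') as f:
    raw = np.frombuffer(f.read(record_bytes), dtype=np.uint8)

label = int(raw[0])                                    # class id in 0..9
image = raw[1:].reshape(3, 32, 32).transpose(1, 2, 0)  # CHW -> HWC, shape (32, 32, 3)
print(label, image.shape)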
How do we read the CIFAR-10 data with TensorFlow?
# coding: utf-8
import os

import tensorflow as tf
import scipy.misc
from six.moves import xrange  # so the xrange calls below also work on Python 3


# Read files from the queue
def read_cifar10(filename_queue):
    """Reads and parses examples from CIFAR10 data files.

    Recommendation: if you want N-way read parallelism, call this function
    N times. This will give you N independent Readers reading different
    files & positions within those files, which will give better mixing of
    examples.

    Args:
        filename_queue: A queue of strings with the filenames to read from.

    Returns:
        An object representing a single example, with the following fields:
            height: number of rows in the result (32)
            width: number of columns in the result (32)
            depth: number of color channels in the result (3)
            key: a scalar string Tensor describing the filename & record number for this example.
            label: an int32 Tensor with the label in the range 0..9.
            uint8image: a [height, width, depth] uint8 Tensor with the image data
    """
    class CIFAR10Record(object):
        pass
    result = CIFAR10Record()

    label_bytes = 1  # 2 for CIFAR-100
    result.height = 32
    result.width = 32
    result.depth = 3
    image_bytes = result.height * result.width * result.depth
    # Every record consists of a label followed by the image, with a fixed number of bytes for each.
    record_bytes = label_bytes + image_bytes

    # Read a record, getting filenames from the filename_queue.
    # No header or footer in the CIFAR-10 format, so we leave header_bytes
    # and footer_bytes at their default of 0.
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    result.key, value = reader.read(filename_queue)

    # Convert from a string to a vector of uint8 that is record_bytes long.
    record_bytes = tf.decode_raw(value, tf.uint8)

    # The first bytes represent the label, which we convert from uint8->int32.
    result.label = tf.cast(tf.strided_slice(record_bytes, [0], [label_bytes]), tf.int32)

    # The remaining bytes after the label represent the image, which we reshape
    # from [depth * height * width] to [depth, height, width].
    depth_major = tf.reshape(
        tf.strided_slice(record_bytes, [label_bytes], [label_bytes + image_bytes]),
        [result.depth, result.height, result.width])
    # Convert from [depth, height, width] to [height, width, depth].
    result.uint8image = tf.transpose(depth_major, [1, 2, 0])

    return result


def inputs_origin(data_dir):
    # There are five filenames, data_batch_1.bin through data_batch_5.bin;
    # all of them are training images.
    filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i) for i in xrange(1, 6)]
    # Make sure every file exists
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' + f)
    # Wrap the list of filenames into a TensorFlow queue
    filename_queue = tf.train.string_input_producer(filenames)
    # The uint8image attribute of the returned read_input is the image Tensor
    read_input = read_cifar10(filename_queue)
    # Convert the image to floats
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)
    # reshaped_image is the tensor of a single image: every sess.run(reshaped_image)
    # pulls out one image.
    return reshaped_image


if __name__ == '__main__':
    # Create a session. For why `with tf.Session() as sess` is not used here, see
    # https://blog.csdn.net/chengqiuming/article/details/80293220
    sess = tf.Session()
    # Call inputs_origin; cifar10_data/cifar-10-batches-bin is where the data was downloaded
    reshaped_image = inputs_origin('cifar10_data/cifar-10-batches-bin')
    # start_queue_runners is essential: the queue created by
    # tf.train.string_input_producer only starts filling after this call,
    # and without it the program cannot proceed.
    threads = tf.train.start_queue_runners(sess=sess)
    # Initialize variables
    sess.run(tf.global_variables_initializer())
    # Create the folder cifar10_data/raw/
    if not os.path.exists('cifar10_data/raw/'):
        os.makedirs('cifar10_data/raw/')
    # Save 30 images
    for i in range(30):
        # Every sess.run(reshaped_image) fetches one image
        image_array = sess.run(reshaped_image)
        # Save the image
        scipy.misc.toimage(image_array).save('cifar10_data/raw/%d.jpg' % i)
For image training data, data augmentation (Data Augmentation) means artificially enlarging the training set with transformations such as translation, scaling and color changes, so the model sees more data and trains better.
Commonly used image augmentation methods are shown below.
The precondition for using any augmentation method is that it must not change the original label of the image.
# Randomly crop the image from 32x32 to 24x24
distorted_image = tf.random_crop(reshaped_image, [height, width, 3])
# Randomly flip the image: each image has a 50% chance of being flipped horizontally
distorted_image = tf.image.random_flip_left_right(distorted_image)
# Randomly change brightness and contrast
distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)
distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)
The original training image is reshaped_image, and the result is an augmented training sample distorted_image; during training, distorted_image is used directly.
The code is organized as follows:
This file (cifar10_input.py) contains three functions related to the training pipeline: read_cifar10, _generate_image_and_label_batch and distorted_inputs. Let's look at each in turn.
# encoding=utf-8
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os

from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf

# Note this is not the original 32x32 size, because the images are cropped later.
# Changing this value alters the whole model architecture, and the model would
# then have to be retrained from scratch.
IMAGE_SIZE = 24

# Global constants
NUM_CLASSES = 10
NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000
NUM_EXAMPLES_PER_EPOCH_FOR_EVAL = 10000
read_cifar10 takes an image from the filename queue; each run fetches one image.
def read_cifar10(filename_queue):
    """Read image data byte by byte from the filename queue.

    Returns an object with fields: height, width, depth, key (the filename),
    label (an int32 Tensor) and uint8image (a [height, width, depth] uint8
    Tensor with the image data).

    Recommendation: if you want N-way read parallelism, call this function
    N times. This will give you N independent Readers reading different files
    & positions within those files, which will give better mixing of examples.
    """
    class CIFAR10Record(object):
        pass
    result = CIFAR10Record()

    # Dimensions of the images in the CIFAR-10 dataset.
    # See http://www.cs.toronto.edu/~kriz/cifar.html for details.
    label_bytes = 1  # 2 for CIFAR-100
    result.height = 32
    result.width = 32
    result.depth = 3
    image_bytes = result.height * result.width * result.depth
    # Every record is laid out as <label><image>
    record_bytes = label_bytes + image_bytes

    # Read a fixed number of bytes; key is the filename, value contains label and image
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    result.key, value = reader.read(filename_queue)

    # Convert from a string to a vector of uint8 that is record_bytes long.
    record_bytes = tf.decode_raw(value, tf.uint8)

    # The first byte is the label (two bytes for CIFAR-100); convert uint8 -> int32.
    result.label = tf.cast(tf.strided_slice(record_bytes, [0], [label_bytes]), tf.int32)

    # The bytes after the label are the image; reshape [depth * height * width]
    # into [depth, height, width].
    depth_major = tf.reshape(
        tf.strided_slice(record_bytes, [label_bytes], [label_bytes + image_bytes]),
        [result.depth, result.height, result.width])
    # Transpose from [depth, height, width] to [height, width, depth].
    result.uint8image = tf.transpose(depth_major, [1, 2, 0])

    return result
TF ops worth noting here: tf.FixedLengthRecordReader (reads fixed-length records from the files in a queue), tf.decode_raw (reinterprets a string as a vector of numbers), tf.strided_slice (extracts a sub-range of a tensor) and tf.transpose (permutes dimensions).
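If tf.decode_raw and tf.strided_slice are unfamiliar, this toy snippet (my own illustration) shows what they do to a 4-byte record whose first byte is the label:

import tensorflow as tf

record = tf.constant(b'\x07abc')               # 1 label byte (value 7) + 3 "pixel" bytes
as_bytes = tf.decode_raw(record, tf.uint8)     # -> [7, 97, 98, 99]
label = tf.strided_slice(as_bytes, [0], [1])   # the first byte
pixels = tf.strided_slice(as_bytes, [1], [4])  # the remaining bytes

with tf.Session() as sess:
    print(sess.run([label, pixels]))           # [array([7], ...), array([97, 98, 99], ...)]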
_generate_image_and_label_batch generates batches of training data.
def _generate_image_and_label_batch(image, label, min_queue_examples, batch_size, shuffle):
    """Construct a batch of images and labels.

    Args:
        image: 3-D Tensor of [height, width, 3] of type.float32.
        label: 1-D Tensor of type.int32
        min_queue_examples: int32, minimum number of samples to retain
            in the queue that provides of batches of examples.
        batch_size: Number of examples per batch.
        shuffle: boolean indicating whether to shuffle the examples.

    Returns:
        images: Images. 4D tensor of [batch_size, height, width, 3] size.
        labels: Labels. 1D tensor of [batch_size] size.
    """
    # Create a queue that shuffles the examples, and then read 'batch_size'
    # images + labels from the example queue.
    num_preprocess_threads = 16
    if shuffle:
        images, label_batch = tf.train.shuffle_batch(
            [image, label],
            batch_size=batch_size,
            num_threads=num_preprocess_threads,
            capacity=min_queue_examples + 3 * batch_size,
            min_after_dequeue=min_queue_examples)
    else:
        images, label_batch = tf.train.batch(
            [image, label],
            batch_size=batch_size,
            num_threads=num_preprocess_threads,
            capacity=min_queue_examples + 3 * batch_size)

    # Display the training images in the visualizer.
    tf.summary.image('images', images)

    return images, tf.reshape(label_batch, [batch_size])
TF ops worth noting here: tf.train.shuffle_batch and tf.train.batch (assemble single examples into batches using a pool of preprocessing threads) and tf.summary.image (records the images for TensorBoard).
The result looks like this:
distorted_inputs uses the two functions above to produce the data that is actually trained on.
def distorted_inputs(data_dir, batch_size):
    """Call read_cifar10 to read images and apply data augmentation, then call
    _generate_image_and_label_batch to produce a batch of data.

    Returns:
        images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
        labels: Labels. 1D tensor of [batch_size] size.
    """
    filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i) for i in xrange(1, 6)]
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' + f)

    # Filename queue
    filename_queue = tf.train.string_input_producer(filenames)

    # Read one image from the filename queue
    read_input = read_cifar10(filename_queue)
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)

    height = IMAGE_SIZE
    width = IMAGE_SIZE

    # Data augmentation
    distorted_image = tf.random_crop(reshaped_image, [height, width, 3])
    distorted_image = tf.image.random_flip_left_right(distorted_image)
    distorted_image = tf.image.random_brightness(distorted_image, max_delta=63)
    distorted_image = tf.image.random_contrast(distorted_image, lower=0.2, upper=1.8)

    # Subtract off the mean and divide by the variance of the pixels.
    float_image = tf.image.per_image_standardization(distorted_image)

    # Set the shapes of tensors.
    float_image.set_shape([height, width, 3])
    read_input.label.set_shape([1])

    # Ensure that the random shuffling has good mixing properties.
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                             min_fraction_of_examples_in_queue)
    print('Filling queue with %d CIFAR images before starting to train.' % min_queue_examples)

    # Generate a batch of images and labels by building up a queue of examples.
    return _generate_image_and_label_batch(float_image, read_input.label,
                                           min_queue_examples, batch_size,
                                           shuffle=True)
cifar10.py is the file that actually implements the training network. Let's first look at how cifar10_train.py drives it, and then study how each step is implemented.
Complete code:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import time

import tensorflow as tf

import cifar10

# tf.app.flags.FLAGS is TensorFlow's global flag store; it also handles command-line arguments
FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('train_dir', '/tmp/cifar10_train',
                           "Directory where to write event logs and checkpoint.")
tf.app.flags.DEFINE_integer('max_steps', 100000,
                            "Number of batches to run.")
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            "Whether to log device placement.")
tf.app.flags.DEFINE_integer('log_frequency', 100,
                            "How often to log results to the console.")


def train():
    """Train CIFAR-10 for a number of steps."""
    with tf.Graph().as_default():
        global_step = tf.contrib.framework.get_or_create_global_step()

        # Get images and labels for CIFAR-10.
        images, labels = cifar10.distorted_inputs()

        # Build a Graph that computes the logits predictions from the inference model.
        logits = cifar10.inference(images)

        # Calculate loss.
        loss = cifar10.loss(logits, labels)

        # Build a Graph that trains the model with one batch of examples and
        # updates the model parameters.
        train_op = cifar10.train(loss, global_step)

        class _LoggerHook(tf.train.SessionRunHook):
            """Logs the loss and runtime."""

            def begin(self):
                self._step = -1
                self._start_time = time.time()

            def before_run(self, run_context):
                self._step += 1
                return tf.train.SessionRunArgs(loss)  # Asks for loss value.

            def after_run(self, run_context, run_values):
                if self._step % FLAGS.log_frequency == 0:
                    current_time = time.time()
                    duration = current_time - self._start_time
                    self._start_time = current_time

                    loss_value = run_values.results
                    examples_per_sec = FLAGS.log_frequency * FLAGS.batch_size / duration
                    sec_per_batch = float(duration / FLAGS.log_frequency)

                    format_str = ('%s: step %d, loss = %.2f '
                                  '(%.1f examples/sec; %.3f sec/batch)')
                    print(format_str % (datetime.now(), self._step, loss_value,
                                        examples_per_sec, sec_per_batch))

        with tf.train.MonitoredTrainingSession(
                checkpoint_dir=FLAGS.train_dir,
                hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps),
                       tf.train.NanTensorHook(loss),
                       _LoggerHook()],
                config=tf.ConfigProto(
                    log_device_placement=FLAGS.log_device_placement)) as mon_sess:
            while not mon_sess.should_stop():
                mon_sess.run(train_op)


def main(argv=None):  # pylint: disable=unused-argument
    cifar10.maybe_download_and_extract()
    if tf.gfile.Exists(FLAGS.train_dir):
        tf.gfile.DeleteRecursively(FLAGS.train_dir)
    tf.gfile.MakeDirs(FLAGS.train_dir)
    train()


if __name__ == '__main__':
    tf.app.run()
This file (cifar10.py) is the key one: it implements the whole network architecture.
# encoding=utf-8
# pylint: disable=missing-docstring
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import re
import sys
import tarfile

from six.moves import urllib
import tensorflow as tf

import cifar10_input

FLAGS = tf.app.flags.FLAGS

# Basic model parameters.
tf.app.flags.DEFINE_integer('batch_size', 128,
                            "Number of images to process in a batch.")
tf.app.flags.DEFINE_string('data_dir', '/tmp/cifar10_data',
                           "Path to the CIFAR-10 data directory.")
tf.app.flags.DEFINE_boolean('use_fp16', False,
                            "Train the model using fp16.")

# Global constants describing the CIFAR-10 data set.
IMAGE_SIZE = cifar10_input.IMAGE_SIZE
NUM_CLASSES = cifar10_input.NUM_CLASSES
NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = cifar10_input.NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN
NUM_EXAMPLES_PER_EPOCH_FOR_EVAL = cifar10_input.NUM_EXAMPLES_PER_EPOCH_FOR_EVAL

# Constants describing the training process.
MOVING_AVERAGE_DECAY = 0.9999     # The decay to use for the moving average.
NUM_EPOCHS_PER_DECAY = 350.0      # Epochs after which learning rate decays.
LEARNING_RATE_DECAY_FACTOR = 0.1  # Learning rate decay factor.
INITIAL_LEARNING_RATE = 0.1       # Initial learning rate.

# If a model is trained with multiple GPUs, prefix all Op names with tower_name
# to differentiate the operations. Note that this prefix is removed from the
# names of the summaries when visualizing a model.
TOWER_NAME = 'tower'

DATA_URL = 'http://www.cs.toronto.edu/~kriz/cifar-10-binary.tar.gz'
Some helper functions:
def _activation_summary(x):
    """Helper to create summaries for activations.

    Creates a summary that provides a histogram of activations.
    Creates a summary that measures the sparsity of activations.

    Args:
        x: Tensor
    """
    # Remove 'tower_[0-9]/' from the name in case this is a multi-GPU training
    # session. This helps the clarity of presentation on tensorboard.
    tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
    tf.summary.histogram(tensor_name + '/activations', x)
    tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))
def _variable_on_cpu(name, shape, initializer):
    """Helper to create a Variable stored on CPU memory.

    Args:
        name: name of the variable
        shape: list of ints
        initializer: initializer for Variable

    Returns:
        Variable Tensor
    """
    with tf.device('/cpu:0'):
        dtype = tf.float16 if FLAGS.use_fp16 else tf.float32
        var = tf.get_variable(name, shape, initializer=initializer, dtype=dtype)
    return var
def _variable_with_weight_decay(name, shape, stddev, wd):
    """Helper to create an initialized Variable with weight decay.

    Note that the Variable is initialized with a truncated normal distribution.
    A weight decay is added only if one is specified.

    Args:
        name: name of the variable
        shape: list of ints
        stddev: standard deviation of a truncated Gaussian
        wd: add L2Loss weight decay multiplied by this float. If None, weight
            decay is not added for this Variable.

    Returns:
        Variable Tensor
    """
    dtype = tf.float16 if FLAGS.use_fp16 else tf.float32
    var = _variable_on_cpu(name, shape,
                           tf.truncated_normal_initializer(stddev=stddev, dtype=dtype))
    if wd is not None:
        weight_decay = tf.multiply(tf.nn.l2_loss(var), wd, name='weight_loss')
        tf.add_to_collection('losses', weight_decay)
    return var
def maybe_download_and_extract():
    """Download and extract the tarball from Alex's website."""
    dest_directory = FLAGS.data_dir
    if not os.path.exists(dest_directory):
        os.makedirs(dest_directory)
    filename = DATA_URL.split('/')[-1]
    filepath = os.path.join(dest_directory, filename)
    if not os.path.exists(filepath):
        def _progress(count, block_size, total_size):
            sys.stdout.write('\r>> Downloading %s %.1f%%' %
                             (filename, float(count * block_size) / float(total_size) * 100.0))
            sys.stdout.flush()
        filepath, _ = urllib.request.urlretrieve(DATA_URL, filepath, _progress)
        print()
        statinfo = os.stat(filepath)
        print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
    extracted_dir_path = os.path.join(dest_directory, 'cifar-10-batches-bin')
    if not os.path.exists(extracted_dir_path):
        tarfile.open(filepath, 'r:gz').extractall(dest_directory)
This wraps the distorted_inputs function from cifar10_input.py with one more layer: based on the use_fp16 flag, it decides whether to cast the data to float16 for computation.
def distorted_inputs():
    """Construct distorted input for CIFAR training using the Reader ops.

    Returns:
        images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
        labels: Labels. 1D tensor of [batch_size] size.

    Raises:
        ValueError: If no data_dir
    """
    if not FLAGS.data_dir:
        raise ValueError('Please supply a data_dir')
    data_dir = os.path.join(FLAGS.data_dir, 'cifar-10-batches-bin')
    images, labels = cifar10_input.distorted_inputs(data_dir=data_dir,
                                                    batch_size=FLAGS.batch_size)
    if FLAGS.use_fp16:
        images = tf.cast(images, tf.float16)
        labels = tf.cast(labels, tf.float16)
    return images, labels
def inference(images):
    """Build the CIFAR-10 model.

    Args:
        images: Images returned from distorted_inputs() or inputs().

    Returns:
        Logits.
    """
    # We instantiate all variables using tf.get_variable() instead of
    # tf.Variable() in order to share variables across multiple GPU training runs.
    # If we only ran this model on a single GPU, we could simplify this function
    # by replacing all instances of tf.get_variable() with tf.Variable().

    # First convolutional layer
    with tf.variable_scope('conv1') as scope:
        kernel = _variable_with_weight_decay('weights', shape=[5, 5, 3, 64],
                                             stddev=5e-2, wd=0.0)
        conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
        pre_activation = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(pre_activation, name=scope.name)
        _activation_summary(conv1)

    pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                           padding='SAME', name='pool1')
    # Local response normalization (LRN); most modern models no longer use it
    norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75, name='norm1')

    # Second convolutional layer
    with tf.variable_scope('conv2') as scope:
        kernel = _variable_with_weight_decay('weights', shape=[5, 5, 64, 64],
                                             stddev=5e-2, wd=0.0)
        conv = tf.nn.conv2d(norm1, kernel, [1, 1, 1, 1], padding='SAME')
        biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.1))
        pre_activation = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(pre_activation, name=scope.name)
        _activation_summary(conv2)

    norm2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75, name='norm2')
    pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                           padding='SAME', name='pool2')

    # Fully connected layers
    with tf.variable_scope('local3') as scope:
        # No more convolutions after this point, so flatten pool2 to make the
        # fully connected layer straightforward.
        reshape = tf.reshape(pool2, [FLAGS.batch_size, -1])
        dim = reshape.get_shape()[1].value
        weights = _variable_with_weight_decay('weights', shape=[dim, 384],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [384], tf.constant_initializer(0.1))
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
        _activation_summary(local3)

    with tf.variable_scope('local4') as scope:
        weights = _variable_with_weight_decay('weights', shape=[384, 192],
                                              stddev=0.04, wd=0.004)
        biases = _variable_on_cpu('biases', [192], tf.constant_initializer(0.1))
        local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name=scope.name)
        _activation_summary(local4)

    # No explicit softmax here: we only output the unscaled logits (softmax_linear).
    # tf.nn.sparse_softmax_cross_entropy_with_logits accepts the unscaled logits
    # and performs the softmax internally for efficiency.
    with tf.variable_scope('softmax_linear') as scope:
        weights = _variable_with_weight_decay('weights', [192, NUM_CLASSES],
                                              stddev=1 / 192.0, wd=0.0)
        biases = _variable_on_cpu('biases', [NUM_CLASSES], tf.constant_initializer(0.0))
        softmax_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
        _activation_summary(softmax_linear)

    return softmax_linear
Two convolutional layers and three fully connected layers.
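As a sanity check on the architecture, here is my own trace of the tensor shapes through inference(), derived from the code above with the default batch_size of 128:

images           [128, 24, 24, 3]
conv1 (5x5, 64)  [128, 24, 24, 64]
pool1 (3x3, s=2) [128, 12, 12, 64]   # followed by LRN (norm1)
conv2 (5x5, 64)  [128, 12, 12, 64]   # followed by LRN (norm2)
pool2 (3x3, s=2) [128, 6, 6, 64]
reshape          [128, 2304]         # 6 * 6 * 64 = 2304
local3           [128, 384]
local4           [128, 192]
softmax_linear   [128, 10]           # unscaled logits, one per class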
def loss(logits, labels):
    """Add L2Loss to all the trainable variables.

    Add summary for "Loss" and "Loss/avg".

    Args:
        logits: Logits from inference().
        labels: Labels from distorted_inputs or inputs(). 1-D tensor of shape [batch_size]

    Returns:
        Loss tensor of type float.
    """
    # Calculate the average cross entropy loss across the batch.
    labels = tf.cast(labels, tf.int64)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        labels=labels, logits=logits, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)

    # The total loss is defined as the cross entropy loss plus all of the weight
    # decay terms (L2 loss).
    return tf.add_n(tf.get_collection('losses'), name='total_loss')
def _add_loss_summaries(total_loss):
    """Add summaries for losses in CIFAR-10 model.

    Generates moving average for all losses and associated summaries for
    visualizing the performance of the network.

    Args:
        total_loss: Total loss from loss().

    Returns:
        loss_averages_op: op for generating moving averages of losses.
    """
    # Compute the moving average of all individual losses and the total loss.
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    losses = tf.get_collection('losses')
    loss_averages_op = loss_averages.apply(losses + [total_loss])

    # Attach a scalar summary to all individual losses and the total loss;
    # do the same for the averaged version of the losses.
    for l in losses + [total_loss]:
        # Name each loss as '(raw)' and name the moving average version of the
        # loss as the original loss name.
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))

    return loss_averages_op
def train(total_loss, global_step):
    """Train CIFAR-10 model.

    Create an optimizer and apply to all trainable variables. Add moving
    average for all trainable variables.

    Args:
        total_loss: Total loss from loss().
        global_step: Integer Variable counting the number of training steps processed.

    Returns:
        train_op: op for training.
    """
    # Variables that affect learning rate.
    num_batches_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN / FLAGS.batch_size
    decay_steps = int(num_batches_per_epoch * NUM_EPOCHS_PER_DECAY)

    # Decay the learning rate exponentially based on the number of steps.
    lr = tf.train.exponential_decay(INITIAL_LEARNING_RATE,
                                    global_step,
                                    decay_steps,
                                    LEARNING_RATE_DECAY_FACTOR,
                                    staircase=True)
    tf.summary.scalar('learning_rate', lr)

    # Generate moving averages of all losses and associated summaries.
    loss_averages_op = _add_loss_summaries(total_loss)

    # Compute gradients.
    with tf.control_dependencies([loss_averages_op]):
        opt = tf.train.GradientDescentOptimizer(lr)
        grads = opt.compute_gradients(total_loss)

    # Apply gradients.
    apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)

    # Add histograms for trainable variables.
    for var in tf.trainable_variables():
        tf.summary.histogram(var.op.name, var)

    # Add histograms for gradients.
    for grad, var in grads:
        if grad is not None:
            tf.summary.histogram(var.op.name + '/gradients', grad)

    # Track the moving averages of all trainable variables.
    variable_averages = tf.train.ExponentialMovingAverage(MOVING_AVERAGE_DECAY, global_step)
    variables_averages_op = variable_averages.apply(tf.trainable_variables())

    with tf.control_dependencies([apply_gradient_op, variables_averages_op]):
        train_op = tf.no_op(name='train')

    return train_op
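One detail worth working out from the constants above (my own back-of-the-envelope calculation, assuming the default batch_size of 128 and max_steps of 100000):

num_batches_per_epoch = 50000 / 128.0  # about 390.6
decay_steps = int(390.625 * 350)       # = 136718
# With staircase=True, lr = 0.1 * 0.1 ** (global_step // 136718),
# so the learning rate stays at 0.1 for the entire default 100000-step run;
# the first decay would only kick in after roughly 137K steps.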
That covers the training code. Run python cifar10_train.py --train_dir cifar10_train/ --data_dir cifar10_data/ to start training, and run tensorboard --logdir cifar10_train/ to watch the training progress in TensorBoard.
I trained for 100K steps (256 epochs of data), which took roughly 3.5 hours.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import time

import numpy as np
import tensorflow as tf

import cifar10

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string('eval_dir', '/tmp/cifar10_eval',
                           "Directory where to write event logs.")
tf.app.flags.DEFINE_string('eval_data', 'test',
                           "Either 'test' or 'train_eval'.")
tf.app.flags.DEFINE_string('checkpoint_dir', '/tmp/cifar10_train',
                           "Directory where to read model checkpoints.")
tf.app.flags.DEFINE_integer('eval_interval_secs', 60 * 5,
                            "How often to run the eval.")
tf.app.flags.DEFINE_integer('num_examples', 10000,
                            "Number of examples to run.")
tf.app.flags.DEFINE_boolean('run_once', False,
                            "Whether to run eval only once.")


def eval_once(saver, summary_writer, top_k_op, summary_op):
    """Run Eval once.

    Args:
        saver: Saver.
        summary_writer: Summary writer.
        top_k_op: Top K op.
        summary_op: Summary op.
    """
    with tf.Session() as sess:
        ckpt = tf.train.get_checkpoint_state(FLAGS.checkpoint_dir)
        if ckpt and ckpt.model_checkpoint_path:
            # Restores from checkpoint
            saver.restore(sess, ckpt.model_checkpoint_path)
            # Assuming model_checkpoint_path looks something like:
            # /my-favorite-path/cifar10_train/model.ckpt-0,
            # extract global_step from it.
            global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        else:
            print('No checkpoint file found')
            return

        # Start the queue runners.
        coord = tf.train.Coordinator()
        try:
            threads = []
            for qr in tf.get_collection(tf.GraphKeys.QUEUE_RUNNERS):
                threads.extend(qr.create_threads(sess, coord=coord, daemon=True, start=True))

            num_iter = int(math.ceil(FLAGS.num_examples / FLAGS.batch_size))
            true_count = 0  # Counts the number of correct predictions.
            total_sample_count = num_iter * FLAGS.batch_size
            step = 0
            while step < num_iter and not coord.should_stop():
                predictions = sess.run([top_k_op])
                true_count += np.sum(predictions)
                step += 1

            # Compute precision @ 1.
            precision = true_count / total_sample_count
            print('%s: precision @ 1 = %.3f' % (datetime.now(), precision))

            summary = tf.Summary()
            summary.ParseFromString(sess.run(summary_op))
            summary.value.add(tag='Precision @ 1', simple_value=precision)
            summary_writer.add_summary(summary, global_step)
        except Exception as e:  # pylint: disable=broad-except
            coord.request_stop(e)

        coord.request_stop()
        coord.join(threads, stop_grace_period_secs=10)


def evaluate():
    """Eval CIFAR-10 for a number of steps."""
    with tf.Graph().as_default() as g:
        # Get images and labels for CIFAR-10.
        eval_data = FLAGS.eval_data == 'test'
        images, labels = cifar10.inputs(eval_data=eval_data)

        # Build a Graph that computes the logits predictions from the
        # inference model.
        logits = cifar10.inference(images)

        # Calculate predictions.
        top_k_op = tf.nn.in_top_k(logits, labels, 1)

        # Restore the moving average version of the learned variables for eval.
        variable_averages = tf.train.ExponentialMovingAverage(cifar10.MOVING_AVERAGE_DECAY)
        variables_to_restore = variable_averages.variables_to_restore()
        saver = tf.train.Saver(variables_to_restore)

        # Build the summary operation based on the TF collection of Summaries.
        summary_op = tf.summary.merge_all()
        summary_writer = tf.summary.FileWriter(FLAGS.eval_dir, g)

        while True:
            eval_once(saver, summary_writer, top_k_op, summary_op)
            if FLAGS.run_once:
                break
            time.sleep(FLAGS.eval_interval_secs)


def main(argv=None):  # pylint: disable=unused-argument
    cifar10.maybe_download_and_extract()
    if tf.gfile.Exists(FLAGS.eval_dir):
        tf.gfile.DeleteRecursively(FLAGS.eval_dir)
    tf.gfile.MakeDirs(FLAGS.eval_dir)
    evaluate()


if __name__ == '__main__':
    tf.app.run()
Run the evaluation with: python cifar10_eval.py --data_dir cifar10_data/ --eval_dir cifar10_eval/ --checkpoint_dir cifar10_train/
The results can be viewed in TensorBoard: tensorboard --logdir cifar10_eval/ --port 6007
Why start a second TensorBoard for evaluation? So the test accuracy can be tracked against the step count. Training and evaluation run at the same time, and the evaluation job keeps reloading the newest checkpoint from the model directory. In practice the model reaches about 86% accuracy at around 60K steps, 86.3% at 100K steps, and stabilizes at roughly 86.6% after 150K steps.
Putting the rest on hold for now... I need to work through the Chapter 3 training material first (work requires it!), and then come back to the tf summary ops and the learning-rate-decay / gradient-descent functions.