Deep Learning TensorFlow Hands-On Notes (2): Converting Images to TFRecords and Reading Them Back

Date: 2019-11-10


1. Preparing the data

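First, sort your images by class into separate folders. For example, create a data folder with two subfolders, up and low, and put the corresponding images in each. You can also name the subfolders 0 and 1 instead of up and low; pick names that match your own classes. The layout looks like this:

data/
    up/     (images of class up)
    low/    (images of class low)

[Figures: three screenshots of the data, up, and low folders]

Pay attention to the directory structure in the three screenshots above. With that, the data is ready.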
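Before converting, it can help to confirm the layout programmatically. The sketch below uses only the standard library; the helper name `count_images` is hypothetical and not part of the original post. It checks that each expected class folder exists and counts the files inside it.

```python
import os


def count_images(data_dir, classes):
    """Return {class_name: file_count} for each expected class folder.

    Raises FileNotFoundError if a class folder is missing.
    """
    counts = {}
    for name in classes:
        class_dir = os.path.join(data_dir, name)
        if not os.path.isdir(class_dir):
            raise FileNotFoundError("missing class folder: " + class_dir)
        # Count regular files only; nested folders are ignored.
        counts[name] = sum(
            1 for f in os.listdir(class_dir)
            if os.path.isfile(os.path.join(class_dir, f))
        )
    return counts
```

Calling `count_images("data", ["up", "low"])` on the layout described above would return a dictionary with one count per class, which makes an empty or misnamed folder easy to spot.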

2. Converting the image data to TFRecords

Here is the code. The important parts are commented.

import os
import tensorflow as tf  # TensorFlow 1.x API
from PIL import Image
import matplotlib.pyplot as plt
import numpy as np

sess = tf.InteractiveSession()  # session used later to evaluate tensors
cwd = "D://software//tensorflow//data//"  # directory that contains the class folders
classes = ['up', 'low']  # your own class names; a list (not a set) keeps the label order fixed
writer = tf.python_io.TFRecordWriter("train.tfrecords")  # name of the output TFRecords file

for index, name in enumerate(classes):
    class_path = cwd + name + "/"
    for img_name in os.listdir(class_path):
        img_path = class_path + img_name
        img = Image.open(img_path).convert("RGB")  # force 3 channels so every record has the same size
        img = img.resize((300, 300))  # target image size; change as needed
        img_raw = img.tobytes()  # raw pixel bytes, 300 * 300 * 3 per image
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
        }))
        writer.write(example.SerializeToString())  # serialize the example and append it to the file
writer.close()
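Each record stores only raw pixel bytes, with no shape or mode information, so the reader has to know the exact layout in advance. The small sketch below (assuming Pillow is installed; the helper name `raw_size` is made up for illustration) shows how the byte count depends on the image mode.

```python
from PIL import Image


def raw_size(mode, size=(300, 300)):
    """Return the number of bytes Image.tobytes() yields for a blank image."""
    img = Image.new(mode, size)
    return len(img.tobytes())


rgb_bytes = raw_size("RGB")    # 3 bytes per pixel
gray_bytes = raw_size("L")     # 1 byte per pixel
rgba_bytes = raw_size("RGBA")  # 4 bytes per pixel
```

Only the RGB case gives 300 * 300 * 3 = 270000 bytes, which is what a reshape to [300, 300, 3] on the reading side expects. A grayscale or RGBA input would need `img.convert("RGB")` before `tobytes()`.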

3. Reading data from TFRecords

Here is the code:

# Read the file
def read_and_decode(filename, batch_size):
    # Build a queue of file names
    filename_queue = tf.train.string_input_producer([filename])
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)   # returns (key, serialized example)
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'label': tf.FixedLenFeature([], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })

    img = tf.decode_raw(features['img_raw'], tf.uint8)
    img = tf.reshape(img, [300, 300, 3])                # restore the image shape used when writing
    # img = tf.cast(img, tf.float32) * (1. / 255) - 0.5   # optional: scale pixels to [-0.5, 0.5]
    label = tf.cast(features['label'], tf.int32)

    # Group the examples into shuffled batches. If you do not need batching,
    # this part can be moved out of the function.

    img_batch, label_batch = tf.train.shuffle_batch([img, label],
                                                    batch_size=batch_size,
                                                    num_threads=64,
                                                    capacity=200,
                                                    min_after_dequeue=150)
    return img_batch, tf.reshape(label_batch, [batch_size])
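What `tf.decode_raw` followed by `tf.reshape` does can be illustrated outside TensorFlow. The following is a NumPy analogue, not the TensorFlow implementation: `np.frombuffer` reinterprets the byte string as a flat `uint8` vector, and `reshape` restores the height, width, channel layout. The random array is stand-in data.

```python
import numpy as np

# Stand-in for one image: random pixels in height x width x channel order.
rng = np.random.RandomState(0)
original = rng.randint(0, 256, size=(300, 300, 3)).astype(np.uint8)

img_raw = original.tobytes()                    # the bytes stored under 'img_raw'
flat = np.frombuffer(img_raw, dtype=np.uint8)   # analogue of tf.decode_raw
restored = flat.reshape(300, 300, 3)            # analogue of tf.reshape
```

The round trip only works when the shape given to `reshape` matches the shape used at write time. If the resize in the conversion step changes, the reshape in the reader has to change with it.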

Points to note:

img = tf.cast(img, tf.float32) * (1. / 255) - 0.5   # scales pixels to [-0.5, 0.5]; enable it if your model needs it
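This line maps `uint8` pixel values from [0, 255] to floats in [-0.5, 0.5]. A NumPy sketch of the same arithmetic, with three sample pixel values chosen for illustration:

```python
import numpy as np

pixels = np.array([0, 51, 255], dtype=np.uint8)

# Cast first, then scale by 1/255 and shift by 0.5.
scaled = pixels.astype(np.float32) * (1. / 255) - 0.5
```

The cast to float has to come before the multiplication, as in the TensorFlow line, because the scaled values cannot be represented as `uint8`.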

# Group the examples into shuffled batches. If you do not need batching,
# this part can be moved out of the function.
    img_batch, label_batch = tf.train.shuffle_batch([img, label],
                                                    batch_size=batch_size,
                                                    num_threads=64,
                                                    capacity=200,
                                                    min_after_dequeue=150)

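If you do not need batching, drop the second parameter batch_size and have the function return img and label directly. You can also move the batching step outside the function; adapt it to your own needs.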

4. Calling the conversion and reading functions

tfrecords_file = 'train.tfrecords'   # TFRecords file to read
BATCH_SIZE = 4      # batch size
image_batch, label_batch = read_and_decode(tfrecords_file, BATCH_SIZE)
print(image_batch, label_batch)    # note: these are symbolic tensors, not image data; use sess.run() to get values

Next, define a session and run it. One point needs attention:

image_batch, label_batch = read_and_decode(tfrecords_file, BATCH_SIZE)   # note this line

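This call sets up the reading pipeline, but image_batch and label_batch are symbolic tensors, not actual data. During training you need image, label = sess.run([image_batch, label_batch]) to fetch concrete arrays before feeding them to the model. Because the pipeline is queue-based, the queue runners also have to be started with tf.train.start_queue_runners, otherwise sess.run blocks. The details are covered in the next post.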
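As a rough end-to-end illustration of that step, the sketch below writes a few random images to a temporary TFRecords file and pulls one batch through a session. It is a minimal example under stated assumptions, not the author's follow-up code: it uses `tensorflow.compat.v1` so that the TF 1.x calls also run on TensorFlow 2.x, the helper names `write_dummy_tfrecords` and `fetch_one_batch` are hypothetical, the data is random stand-in data, and the queue sizes are reduced to suit the tiny file.

```python
import os
import tempfile

import numpy as np
import tensorflow.compat.v1 as tf  # TF 1.x-style API, also shipped with TF 2.x

tf.disable_v2_behavior()

H, W, C = 300, 300, 3


def write_dummy_tfrecords(path, num_examples=8):
    """Write random uint8 images with alternating 0/1 labels (stand-in data)."""
    rng = np.random.RandomState(0)
    writer = tf.io.TFRecordWriter(path)
    for i in range(num_examples):
        img = rng.randint(0, 256, size=(H, W, C)).astype(np.uint8)
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[i % 2])),
            "img_raw": tf.train.Feature(bytes_list=tf.train.BytesList(value=[img.tobytes()])),
        }))
        writer.write(example.SerializeToString())
    writer.close()


def fetch_one_batch(path, batch_size=4):
    """Build the queue-based input pipeline and pull a single batch from it."""
    tf.reset_default_graph()
    filename_queue = tf.train.string_input_producer([path])
    reader = tf.TFRecordReader()
    _, serialized = reader.read(filename_queue)
    features = tf.parse_single_example(serialized, features={
        "label": tf.FixedLenFeature([], tf.int64),
        "img_raw": tf.FixedLenFeature([], tf.string),
    })
    img = tf.reshape(tf.decode_raw(features["img_raw"], tf.uint8), [H, W, C])
    label = tf.cast(features["label"], tf.int32)
    img_batch, label_batch = tf.train.shuffle_batch(
        [img, label], batch_size=batch_size,
        num_threads=2, capacity=20, min_after_dequeue=8)

    with tf.Session() as sess:
        # Queue runners fill the input queues in background threads.
        coord = tf.train.Coordinator()
        threads = tf.train.start_queue_runners(sess=sess, coord=coord)
        try:
            # sess.run turns the symbolic tensors into NumPy arrays.
            images, labels = sess.run([img_batch, label_batch])
        finally:
            coord.request_stop()
            coord.join(threads)
    return images, labels


tfrecords_path = os.path.join(tempfile.mkdtemp(), "dummy.tfrecords")
write_dummy_tfrecords(tfrecords_path)
images, labels = fetch_one_batch(tfrecords_path)
```

Here `images` is a NumPy array of shape (4, 300, 300, 3) and `labels` an array of shape (4,). In a training loop, the `sess.run` call would sit inside the loop while the coordinator and queue runners stay alive around it.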
