In the previous posts we covered how GANs work; now let's build a GAN with TensorFlow (strictly speaking a DCGAN; unless noted otherwise, "GAN" in this series means DCGAN). As mentioned before, GANs come in two flavors, conditional and unconditional, and we will start by building a simple conditional GAN on the MNIST dataset.
First, download the data: create the folder data/mnist under /home/your_name/TensorFlow/DCGAN/, then download the MNIST files train-images-idx3-ubyte.gz, train-labels-idx1-ubyte.gz, t10k-images-idx3-ubyte.gz and t10k-labels-idx1-ubyte.gz from http://yann.lecun.com/exdb/mnist/ into the mnist folder, giving four .gz files. Unpack them afterwards, since the reader code below expects the raw ubyte files.
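If you would rather script this step, here is a minimal sketch (my own helper, not from the original post; it assumes Python 3) that downloads and unpacks the four archives listed above:

import gzip
import os
import urllib.request

data_dir = '/home/your_name/TensorFlow/DCGAN/data/mnist'
files = ['train-images-idx3-ubyte.gz', 'train-labels-idx1-ubyte.gz',
         't10k-images-idx3-ubyte.gz', 't10k-labels-idx1-ubyte.gz']

os.makedirs(data_dir, exist_ok=True)
for name in files:
    gz_path = os.path.join(data_dir, name)
    urllib.request.urlretrieve('http://yann.lecun.com/exdb/mnist/' + name, gz_path)
    # drop the .gz suffix and write the decompressed bytes alongside the archive
    with gzip.open(gz_path, 'rb') as f_in, open(gz_path[:-3], 'wb') as f_out:
        f_out.write(f_in.read())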
Once the data is downloaded, create a file read_data.py under /home/your_name/TensorFlow/DCGAN/ to read it, and enter the following code:
import os

import numpy as np


def read_data():
    # data directory
    data_dir = '/home/your_name/TensorFlow/DCGAN/data/mnist'

    # open the training images
    fd = open(os.path.join(data_dir, 'train-images-idx3-ubyte'), 'rb')
    # convert to a numpy array
    loaded = np.fromfile(file=fd, dtype=np.uint8)
    # per the format described on the MNIST site, pixel data starts at byte 16
    trX = loaded[16:].reshape((60000, 28, 28, 1)).astype(np.float)

    # training labels (label data starts at byte 8)
    fd = open(os.path.join(data_dir, 'train-labels-idx1-ubyte'), 'rb')
    loaded = np.fromfile(file=fd, dtype=np.uint8)
    trY = loaded[8:].reshape((60000)).astype(np.float)

    # test images
    fd = open(os.path.join(data_dir, 't10k-images-idx3-ubyte'), 'rb')
    loaded = np.fromfile(file=fd, dtype=np.uint8)
    teX = loaded[16:].reshape((10000, 28, 28, 1)).astype(np.float)

    # test labels
    fd = open(os.path.join(data_dir, 't10k-labels-idx1-ubyte'), 'rb')
    loaded = np.fromfile(file=fd, dtype=np.uint8)
    teY = loaded[8:].reshape((10000)).astype(np.float)

    trY = np.asarray(trY)
    teY = np.asarray(teY)

    # The generator produces images from noise drawn from some distribution,
    # so no test set is needed; merge the training and test data.
    X = np.concatenate((trX, teX), axis=0)
    y = np.concatenate((trY, teY), axis=0)

    # shuffle, reusing the same seed so X and y stay aligned
    seed = 547
    np.random.seed(seed)
    np.random.shuffle(X)
    np.random.seed(seed)
    np.random.shuffle(y)

    # y_vec is the condition imposed on the network, namely the class label.
    # It is simply the one-hot encoding of y; for what one-hot encoding is,
    # see http://www.cnblogs.com/Charles-Wan/p/6207039.html
    y_vec = np.zeros((len(y), 10), dtype=np.float)
    for i, label in enumerate(y):
        y_vec[i, int(label)] = 1.0

    return X/255., y_vec
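As a quick sanity check (this snippet is my own addition, not from the original post), read_data should return 70,000 images scaled to [0, 1] together with their one-hot labels:

X, y_vec = read_data()
print(X.shape)           # (70000, 28, 28, 1)
print(y_vec.shape)       # (70000, 10)
print(X.min(), X.max())  # 0.0 1.0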
A side note: since the MNIST data as a whole does not take up much memory (look at the downloaded files; the largest is about 45 MB), reading it this way is acceptable. In general, when the data is very large, it is advisable to convert it to tfrecords, TensorFlow's standard data-reading format, which gives noticeably better efficiency.
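For reference, here is a minimal sketch of what such a conversion might look like under the same TF 1.x-era API this post uses (the output file name and feature keys are my own choices, not something the post prescribes):

import numpy as np
import tensorflow as tf

def to_tfrecords(X, y_vec, path='mnist.tfrecords'):
    # serialize each (image, one-hot label) pair as one tf.train.Example
    writer = tf.python_io.TFRecordWriter(path)
    for image, label in zip(X, y_vec):
        example = tf.train.Example(features=tf.train.Features(feature={
            'image': tf.train.Feature(bytes_list=tf.train.BytesList(
                value=[image.astype(np.float32).tobytes()])),
            'label': tf.train.Feature(float_list=tf.train.FloatList(
                value=label.tolist())),
        }))
        writer.write(example.SerializeToString())
    writer.close()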
而後,定義一些基本的操做層,例如卷積,池化,全鏈接等層,在 /home/your_name/TensorFlow/DCGAN/ 新建文件 ops.py,輸入以下代碼:數組
import tensorflow as tf
from tensorflow.contrib.layers.python.layers import batch_norm


# constant bias
def bias(name, shape, bias_start=0.0, trainable=True):
    dtype = tf.float32
    var = tf.get_variable(name, shape, tf.float32, trainable=trainable,
                          initializer=tf.constant_initializer(
                              bias_start, dtype=dtype))
    return var


# random-normal weights
def weight(name, shape, stddev=0.02, trainable=True):
    dtype = tf.float32
    var = tf.get_variable(name, shape, tf.float32, trainable=trainable,
                          initializer=tf.random_normal_initializer(
                              stddev=stddev, dtype=dtype))
    return var


# fully connected layer
def fully_connected(value, output_shape, name='fully_connected', with_w=False):
    shape = value.get_shape().as_list()
    with tf.variable_scope(name):
        weights = weight('weights', [shape[1], output_shape], 0.02)
        biases = bias('biases', [output_shape], 0.0)
        if with_w:
            return tf.matmul(value, weights) + biases, weights, biases
        else:
            return tf.matmul(value, weights) + biases


# Leaky-ReLU layer
def lrelu(x, leak=0.2, name='lrelu'):
    with tf.variable_scope(name):
        return tf.maximum(x, leak * x, name=name)


# ReLU layer
def relu(value, name='relu'):
    with tf.variable_scope(name):
        return tf.nn.relu(value)


# transposed convolution (deconvolution) layer
def deconv2d(value, output_shape, k_h=5, k_w=5, strides=[1, 2, 2, 1],
             name='deconv2d', with_w=False):
    with tf.variable_scope(name):
        weights = weight('weights',
                         [k_h, k_w, output_shape[-1], value.get_shape()[-1]])
        deconv = tf.nn.conv2d_transpose(value, weights, output_shape,
                                        strides=strides)
        biases = bias('biases', [output_shape[-1]])
        deconv = tf.reshape(tf.nn.bias_add(deconv, biases), deconv.get_shape())
        if with_w:
            return deconv, weights, biases
        else:
            return deconv


# convolution layer
def conv2d(value, output_dim, k_h=5, k_w=5, strides=[1, 2, 2, 1], name='conv2d'):
    with tf.variable_scope(name):
        weights = weight('weights',
                         [k_h, k_w, value.get_shape()[-1], output_dim])
        conv = tf.nn.conv2d(value, weights, strides=strides, padding='SAME')
        biases = bias('biases', [output_dim])
        conv = tf.reshape(tf.nn.bias_add(conv, biases), conv.get_shape())
        return conv


# concatenate the condition onto the feature maps
def conv_cond_concat(value, cond, name='concat'):
    # convert the tensor shapes to Python lists
    value_shapes = value.get_shape().as_list()
    cond_shapes = cond.get_shape().as_list()

    # Concatenate the condition and the input along axis 3 (the feature-map
    # axis). The condition is pre-shaped into a 4-D tensor: if the input is a
    # [64, 32, 32, 32] tensor and the condition a [64, 32, 32, 10] tensor,
    # the output is a [64, 32, 32, 42] tensor.
    with tf.variable_scope(name):
        return tf.concat(3, [value,
                             cond * tf.ones(value_shapes[0:3] + cond_shapes[3:])])


# Batch Normalization layer
def batch_norm_layer(value, is_train=True, name='batch_norm'):
    with tf.variable_scope(name) as scope:
        if is_train:
            return batch_norm(value, decay=0.9, epsilon=1e-5, scale=True,
                              is_training=is_train,
                              updates_collections=None, scope=scope)
        else:
            return batch_norm(value, decay=0.9, epsilon=1e-5, scale=True,
                              is_training=is_train, reuse=True,
                              updates_collections=None, scope=scope)
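To see how these ops fit together, here is an illustrative fragment in the generator's style (my own example with shapes chosen for illustration; it is not part of ops.py): the one-hot label is reshaped to a 4-D tensor, tiled onto the feature maps by conv_cond_concat, and the result is upsampled with deconv2d:

import tensorflow as tf

batch_size = 64
h = tf.placeholder(tf.float32, [batch_size, 14, 14, 128])  # feature maps
y = tf.placeholder(tf.float32, [batch_size, 10])           # one-hot condition

# reshape the condition to 4-D so conv_cond_concat can tile and append it
yb = tf.reshape(y, [batch_size, 1, 1, 10])
h_cond = conv_cond_concat(h, yb)                   # -> [64, 14, 14, 138]
# strides of [1, 2, 2, 1] double the spatial size: 14x14 -> 28x28
h_up = deconv2d(h_cond, [batch_size, 28, 28, 1], name='g_deconv')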
There are many ways to use a Batch Normalization layer in TensorFlow; here we simply use the layer from the official contrib module. decay is the decay of the moving averages; epsilon is added to the variance in the denominator to avoid division by zero; scale is a boolean: if True, the result is multiplied by gamma, otherwise gamma is not used; is_train is also a boolean, True for the training phase and False for the test phase (a BN layer behaves differently during training and testing; see the paper https://arxiv.org/abs/1502.03167 for details). For the remaining parameters of batch_norm, see reference 2.
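To make the roles of decay and epsilon concrete, here is a schematic NumPy restatement of what one training step of the layer computes (my own illustration of the BN paper's formulas, not the contrib code itself):

import numpy as np

def bn_train_step(x, gamma, beta, moving_mean, moving_var,
                  decay=0.9, epsilon=1e-5):
    # normalize with the statistics of the current batch, then scale and shift
    mean, var = x.mean(axis=0), x.var(axis=0)
    y = gamma * (x - mean) / np.sqrt(var + epsilon) + beta
    # update the moving averages that are used instead at test time
    moving_mean = decay * moving_mean + (1 - decay) * mean
    moving_var = decay * moving_var + (1 - decay) * var
    return y, moving_mean, moving_var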
References:
1. https://github.com/carpedm20/DCGAN-tensorflow
2. https://github.com/tensorflow/tensorflow/blob/b826b79718e3e93148c3545e7aa3f90891744cc0/tensorflow/contrib/layers/python/layers/layers.py#L100