Building Convolutional Neural Networks in TensorFlow / Saving and Loading Models / Regularization

TensorFlow

Official documentation: https://www.tensorflow.org/api_guides/python/math_ops

# Arithmetic Operators
import tensorflow as tf

# Use the feed_dict parameter of tf.Session().run() to assign values to placeholder tensors. If the data passed in feed_dict does not match the tensor's type, it cannot be processed correctly
x = tf.placeholder(tf.string)
y = tf.placeholder(tf.int32)
z = tf.placeholder(tf.float32)

with tf.Session() as sess:
    output = sess.run(x, feed_dict={x: 'Test String', y: 123, z: 45.67})

# Math operations and type casting
tf.subtract(tf.cast(tf.constant(2.0), tf.int32), tf.constant(1))   # 1
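# For comparison, the same subtraction without the cast fails, since 2.0
# (float32) and 1 (int32) have mismatched types:
# tf.subtract(tf.constant(2.0), tf.constant(1))  # raises a type mismatch error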

# Linear classification function
# The tf.Variable class creates a tensor whose initial value can be changed, much like a normal Python variable. The tensor stores its state in the session, so you must initialize its state manually. Use tf.global_variables_initializer() to initialize all mutable tensors.
# tf.global_variables_initializer() returns an operation that initializes all TensorFlow variables in the graph. You call this operation through a session to initialize all the variables above
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    
# Drawing weights from a normal distribution prevents any single weight from overwhelming the others
# tf.truncated_normal() generates random numbers from a normal distribution,
# returning a tensor whose random values are drawn from a normal distribution and fall within two standard deviations of its mean
n_features = 120
n_labels = 5
weights = tf.Variable(tf.truncated_normal((n_features, n_labels)))

# Since the weights are already randomized to keep the model from getting stuck, the bias doesn't need to be randomized too. Let's simply set the bias to 0.
n_labels = 5
bias = tf.Variable(tf.zeros(n_labels))
# Initialize all variables at once:
sess.run(tf.global_variables_initializer())

# Matrix multiplication:
tf.matmul(input, w)
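Putting the pieces together, a minimal linear-classifier sketch (input placeholder assumed for illustration) using the weights and bias defined above:

x_input = tf.placeholder(tf.float32, (None, n_features))
# xW + b: (None, 120) x (120, 5) + (5,) -> (None, 5)
linear_output = tf.add(tf.matmul(x_input, weights), bias)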

# TensorFlow Softmax
# x = tf.nn.softmax([2.0, 1.0, 0.2])
output = None
logit_data = [2.0, 1.0, 0.1]
logits = tf.placeholder(tf.float32)
with tf.Session() as sess:
    output = sess.run(tf.nn.softmax(logits), feed_dict={logits:logit_data})
# Cross Entropy in TensorFlow
# tf.reduce_sum() takes a sequence and returns its sum
# tf.reduce_mean() computes the mean of a sequence
# tf.log() returns the natural logarithm of its input
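# Illustrative one-liners (the values in comments are what sess.run would return):
# tf.reduce_sum([1., 2., 3., 4., 5.])   # -> 15.0
# tf.reduce_mean([1., 2., 3., 4., 5.])  # -> 3.0
# tf.log(100.)                          # -> 4.60517 (natural log)
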
softmax_data = [0.7, 0.2, 0.1]
one_hot_data = [1.0, 0.0, 0.0]

softmax = tf.placeholder(tf.float32)
one_hot = tf.placeholder(tf.float32)

cross_entropy = -tf.reduce_sum(tf.multiply(one_hot, tf.log(softmax)))
# TODO: Print cross entropy from session
with tf.Session() as session:
    output = session.run(cross_entropy, feed_dict={softmax: softmax_data, one_hot: one_hot_data})
    print(output)
  • TensorFlow Mini-batching
# Sometimes it's impossible to split the data into batches of exactly equal size. For example, with 1000 data points and a desired batch size of 128, 1000 is not divisible by 128, so you end up with 7 batches of 128 data points and one batch of 104. (7*128 + 1*104 = 1000)

# When the number of data points per batch can vary, use TensorFlow's tf.placeholder() function to accept these varying batches
# If each sample has n_input = 784 features and there are n_classes = 10 possible labels, the dimensions of features should be [None, n_input] and labels should be [None, n_classes]

# Features and Labels
# The None dimension here is a placeholder for the batch size. At runtime, TensorFlow accepts any batch size greater than 0
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])
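A helper along these lines (a sketch, not part of the original notes) splits features and labels into such batches:

def batches(batch_size, features, labels):
    """Split features and labels into batches of at most batch_size."""
    assert len(features) == len(labels)
    output = []
    for start in range(0, len(features), batch_size):
        end = start + batch_size
        output.append([features[start:end], labels[start:end]])
    return output

# With the hypothetical 1000-point dataset above and batch_size = 128, this
# returns 8 batches: seven of 128 points and a final one of 104.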
  • Learning rate and epochs
  1. If the learning rate is too high, accuracy stops improving too early for a given number of epochs, and the final accuracy ends up lower
  2. A lower learning rate needs more epochs, but can ultimately reach better accuracy
  • Choosing the number of hidden units
  1. Article: https://www.quora.com/How-do-I-decide-the-number-of-nodes-in-a-hidden-layer-of-a-neural-network-I-will-be-using-a-three-layer-model
  • The number of hidden nodes you should have is based on a complex relationship between:
    1. Number of input and output nodes
    2. Amount of training data available
    3. Complexity of the function that is trying to be learned
    4. The training algorithm
  • Too few nodes will lead to high error for your system as the predictive factors might be too complex for a small number of nodes to capture
    Too many nodes will overfit to your training data and not generalize well
  • Some general advice: How many hidden units should I use? (detailed)
    http://www.faqs.org/faqs/ai-faq/neural-nets/part3/section-10.html
    1. The number of hidden nodes in each layer should be somewhere between the size of the input and output layer, potentially the mean.
    2. The number of hidden nodes shouldn't need to exceed twice the number of input nodes, as you are probably grossly overfitting at this point.
  1. The main requirement when choosing the number of hidden units is that the network can predict accurately and generalize well without overfitting.
    The number of hidden units generally shouldn't exceed twice the number of input units; more than that risks overfitting and is computationally expensive, while too few hurts the model's ability to fit
  • Choosing the learning rate
    1. The learning rate should let the network converge successfully while remaining time-efficient
    2. A rule of thumb for the learning rate self.lr is that self.lr / n_records should approach 0.01; try different values around this, e.g. with batch = 128, pick lr ≈ 1 (since 128 * 0.01 ≈ 1.28)
    3. Tune the learning rate together with the other two hyperparameters, especially the number of iterations; the optimum for one parameter alone is not necessarily the overall optimum, so adjust all three together for the best result
  • TensorFlow Examples:
    https://github.com/aymericdamien/TensorFlow-Examples

  • Building a deep neural network in TensorFlow
# A demo
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

import tensorflow as tf

# Parameters
learning_rate = 0.001
training_epochs = 20
batch_size = 128  # Decrease batch size if you don't have enough memory
display_step = 1

n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

n_hidden_layer = 256  # number of units in the hidden layer

# Store layers weight & bias
weights = {
    'hidden_layer': tf.Variable(tf.random_normal([n_input, n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_hidden_layer, n_classes]))
}
biases = {
    'hidden_layer': tf.Variable(tf.random_normal([n_hidden_layer])),
    'out': tf.Variable(tf.random_normal([n_classes]))
}

# tf Graph input
x = tf.placeholder("float", [None, 28, 28, 1])
y = tf.placeholder("float", [None, n_classes])

x_flat = tf.reshape(x, [-1, n_input])

# Hidden layer with ReLU activation
layer_1 = tf.add(tf.matmul(x_flat, weights['hidden_layer']),\
    biases['hidden_layer'])
layer_1 = tf.nn.relu(layer_1)
# Output layer with linear activation
logits = tf.add(tf.matmul(layer_1, weights['out']), biases['out'])

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    # Training cycle
    for epoch in range(training_epochs):
        total_batch = int(mnist.train.num_examples/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            # Run the optimization op (backpropagation)
            sess.run(optimizer, feed_dict={x: batch_x, y: batch_y})
            
# Number of hidden units: the width of the hidden layer
# Number of hidden layers: the depth of the network
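As a sketch of adding depth (sizes assumed for illustration), a second hidden layer simply chains another weight matrix and ReLU between layer_1 and the output layer:

n_hidden_2 = 128  # hypothetical width of the extra layer
w2 = tf.Variable(tf.random_normal([n_hidden_layer, n_hidden_2]))
b2 = tf.Variable(tf.random_normal([n_hidden_2]))
layer_2 = tf.nn.relu(tf.add(tf.matmul(layer_1, w2), b2))
# The output layer would then take layer_2 ([None, n_hidden_2]) as its input.
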
Saving and Loading TensorFlow Models
  • Saving variables
# Saving variables
# The weights and bias tensors are set to random values with tf.truncated_normal(). The values are saved to the save_file location with tf.train.Saver.save(), named "model.ckpt" (the ".ckpt" extension stands for "checkpoint")

import tensorflow as tf

# The file path to save the data
save_file = './model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Initialize all the Variables
    sess.run(tf.global_variables_initializer())

    # Show the values of weights and bias
    print('Weights:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))

    # Save the model
    saver.save(sess, save_file)
  • Loading variables
# Loading variables
# Note that you still need to create the weights and bias tensors in Python. tf.train.Saver.restore() loads the previously saved data into weights and bias
# Since tf.train.Saver.restore() sets the TensorFlow variables, you don't need to call tf.global_variables_initializer() here
# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))

# Class used to save and/or restore Tensor Variables
saver = tf.train.Saver()

with tf.Session() as sess:
    # Load the weights and bias
    saver.restore(sess, save_file)

    # Show the values of weights and bias
    print('Weight:')
    print(sess.run(weights))
    print('Bias:')
    print(sess.run(bias))
  • Saving a trained model
# Saving a trained model
# Remove previous Tensors and Operations
tf.reset_default_graph()

from tensorflow.examples.tutorials.mnist import input_data
import math  # needed for math.ceil in the training loop below

learning_rate = 0.001
n_input = 784  # MNIST data input (img shape: 28*28)
n_classes = 10  # MNIST total classes (0-9 digits)

# Import MNIST data
mnist = input_data.read_data_sets('.', one_hot=True)

# Features and Labels
features = tf.placeholder(tf.float32, [None, n_input])
labels = tf.placeholder(tf.float32, [None, n_classes])

# Weights & bias
weights = tf.Variable(tf.random_normal([n_input, n_classes]))
bias = tf.Variable(tf.random_normal([n_classes]))

# Logits - xW + b
logits = tf.add(tf.matmul(features, weights), bias)

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=labels))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Calculate accuracy
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Train the model and save the weights:
save_file = './train_model.ckpt'
batch_size = 128
n_epochs = 100

saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # Training cycle
    for epoch in range(n_epochs):
        total_batch = math.ceil(mnist.train.num_examples / batch_size)

        # Loop over all batches
        for i in range(total_batch):
            batch_features, batch_labels = mnist.train.next_batch(batch_size)
            sess.run(
                optimizer,
                feed_dict={features: batch_features, labels: batch_labels})

        # Print status every 10 epochs
        if epoch % 10 == 0:
            valid_accuracy = sess.run(
                accuracy,
                feed_dict={
                    features: mnist.validation.images,
                    labels: mnist.validation.labels})
            print('Epoch {:<3} - Validation Accuracy: {}'.format(
                epoch,
                valid_accuracy))

    # Save the model
    # 保存模型
    saver.save(sess, save_file)
    print('Trained Model Saved.')
  • Loading a trained model
# Load the trained model
saver = tf.train.Saver()

# Launch the graph
with tf.Session() as sess:
    saver.restore(sess, save_file)

    test_accuracy = sess.run(
        accuracy,
        feed_dict={features: mnist.test.images, labels: mnist.test.labels})

print('Test Accuracy: {}'.format(test_accuracy))
  • Loading weights and bias into a new model
# You'll often want to adjust, or "fine-tune", a model you have already trained and saved. Loading the saved variables directly into a modified model, however, can produce errors
# TensorFlow uses a string identifier called name for tensors and operations. If no name is given, TensorFlow creates one automatically, which easily leads to naming mismatches
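For example, this sketch (same pattern as below, but with no explicit names) saves weights before bias, then redefines them in the opposite order; restoring fails because the auto-generated names no longer match the saved shapes:

import tensorflow as tf

tf.reset_default_graph()
save_file = 'model.ckpt'

# Saved as 'Variable:0' and 'Variable_1:0' (auto-generated names)
weights = tf.Variable(tf.truncated_normal([2, 3]))
bias = tf.Variable(tf.truncated_normal([3]))
saver = tf.train.Saver()

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

tf.reset_default_graph()

# Defined in the opposite order: bias is now 'Variable:0', weights 'Variable_1:0'
bias = tf.Variable(tf.truncated_normal([3]))
weights = tf.Variable(tf.truncated_normal([2, 3]))
saver = tf.train.Saver()

with tf.Session() as sess:
    # Fails with InvalidArgumentError: the shape saved under each name
    # doesn't match the shape of the variable now holding that name
    saver.restore(sess, save_file)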

# Setting the name property manually:
import tensorflow as tf

tf.reset_default_graph()

save_file = 'model.ckpt'

# Two Tensor Variables: weights and bias
weights = tf.Variable(tf.truncated_normal([2, 3]), name='weights_0')
bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Save Weights: {}'.format(weights.name))
print('Save Bias: {}'.format(bias.name))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, save_file)

# Remove the previous weights and bias
tf.reset_default_graph()

# Two Variables: weights and bias
bias = tf.Variable(tf.truncated_normal([3]), name='bias_0')
weights = tf.Variable(tf.truncated_normal([2, 3]) ,name='weights_0')

saver = tf.train.Saver()

# Print the name of Weights and Bias
print('Load Weights: {}'.format(weights.name))
print('Load Bias: {}'.format(bias.name))

with tf.Session() as sess:
    # Load the weights and bias - No Error
    saver.restore(sess, save_file)

print('Loaded Weights and Bias successfully.')

Save Weights: weights_0:0

Save Bias: bias_0:0

Load Weights: weights_0:0

Load Bias: bias_0:0

Loaded Weights and Bias successfully.

  • TensorFlow Dropout
# Dropout is a regularization technique for reducing overfitting. It temporarily drops units (neurons) from the network, along with all of their incoming and outgoing connections
# TensorFlow provides the tf.nn.dropout() function, which you can use to implement dropout

keep_prob = tf.placeholder(tf.float32) # probability to keep units

hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

# tf.nn.dropout() takes two parameters:
# hidden_layer: the tensor you want to apply dropout to
# keep_prob: the probability of keeping (i.e. not dropping) any given unit

# keep_prob lets you adjust the number of units dropped. To compensate for the dropped units, tf.nn.dropout() multiplies all kept units (those not dropped) by 1/keep_prob
# During training, a good starting value for keep_prob is 0.5.

# During testing, set keep_prob to 1.0 to keep all units and maximize the model's power
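A tiny sketch (illustrative values) showing the 1/keep_prob scaling:

with tf.Session() as sess:
    acts = tf.ones([4])                 # pretend activations, all 1.0
    dropped = tf.nn.dropout(acts, 0.5)  # keep_prob = 0.5
    print(sess.run(dropped))            # e.g. [2. 0. 2. 2.] -- kept units are doubled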

A demo:

keep_prob = tf.placeholder(tf.float32) # probability to keep units

hidden_layer = tf.add(tf.matmul(features, weights[0]), biases[0])
hidden_layer = tf.nn.relu(hidden_layer)
hidden_layer = tf.nn.dropout(hidden_layer, keep_prob)

logits = tf.add(tf.matmul(hidden_layer, weights[1]), biases[1])

...

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    for epoch_i in range(epochs):
        for batch_i in range(batches):
            ....

            sess.run(optimizer, feed_dict={
                features: batch_features,
                labels: batch_labels,
                keep_prob: 0.5})

    validation_accuracy = sess.run(accuracy, feed_dict={
        features: test_features,
        labels: test_labels,
        keep_prob: 1.0})
  • TensorFlow convolutional neural networks

    Setup
    H = height, W = width, D = depth

We have an input of dimensions 32x32x3 (HxWxD),
20 filters of dimensions 8x8x3 (HxWxD),
a stride (S) of 2 for both height and width,
and padding (P) of 1.
The formulas for the new height and width are:

new_height = (input_height - filter_height + 2 * P)/S + 1

new_width = (input_width - filter_width + 2 * P)/S + 1

Here: new_height = new_width = (32 - 8 + 2*1)/2 + 1 = 14

Output layer size: 14x14x20

input = tf.placeholder(tf.float32, (None, 32, 32, 3))
filter_weights = tf.Variable(tf.truncated_normal((8, 8, 3, 20))) # (height, width, input_depth, output_depth)
filter_bias = tf.Variable(tf.zeros(20))
strides = [1, 2, 2, 1] # (batch, height, width, depth)
padding = 'VALID'
conv = tf.nn.conv2d(input, filter_weights, strides, padding) + filter_bias
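Note that tf.nn.conv2d() only supports 'SAME' and 'VALID' padding, so the code above with 'VALID' actually produces 13x13x20 (see the formulas below). To reproduce P = 1 exactly, one option (a sketch, not from the original notes) is to pad the input manually:

padded = tf.pad(input, [[0, 0], [1, 1], [1, 1], [0, 0]])  # P = 1 on height and width
conv = tf.nn.conv2d(padded, filter_weights, strides, 'VALID') + filter_bias
print(conv.get_shape())  # (?, 14, 14, 20), matching the formula above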


# TensorFlow computes the output size for 'SAME' and 'VALID' padding as follows

# SAME padding -- output height and width:

out_height = ceil(float(in_height) / float(strides[1]))

out_width = ceil(float(in_width) / float(strides[2]))

# VALID padding -- output height and width:

out_height = ceil(float(in_height - filter_height + 1) / float(strides[1]))

out_width = ceil(float(in_width - filter_width + 1) / float(strides[2]))

# ceil: returns the smallest integer greater than or equal to the given expression
# For the 32x32 input above with stride 2: SAME gives ceil(32/2) = 16, while VALID gives ceil((32 - 8 + 1)/2) = 13
  • Implementing a convolutional layer in TensorFlow
# TensorFlow provides the tf.nn.conv2d() and tf.nn.bias_add() functions for building your own convolutional layers

# Output depth
k_output = 64

# Image Properties
image_width = 10
image_height = 10
color_channels = 3

# Convolution filter
filter_size_width = 5
filter_size_height = 5

# Input/Image
input = tf.placeholder(
    tf.float32,
    shape=[None, image_height, image_width, color_channels])

# Weight and bias
weight = tf.Variable(tf.truncated_normal(
    [filter_size_height, filter_size_width, color_channels, k_output]))
bias = tf.Variable(tf.zeros(k_output))

# Apply Convolution
conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
# Add bias
conv_layer = tf.nn.bias_add(conv_layer, bias)
# Apply activation function
conv_layer = tf.nn.relu(conv_layer)

# The code above uses tf.nn.conv2d() to compute the convolution, with weight as the filter and [1, 2, 2, 1] as the strides.
# TensorFlow uses a separate stride parameter for each input dimension: [batch, input_height, input_width, input_channels]. We usually set the batch and input_channels strides (the first and fourth elements) to 1
# The input_height and input_width strides control how far the filter moves across the input

# tf.nn.bias_add() adds a bias to the last dimension of the matrix
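A quick shape check (a sketch using the tensors above): with 'SAME' padding and stride 2, the 10x10 input maps to ceil(10/2) = 5 in each spatial dimension:

print(conv_layer.get_shape())  # (?, 5, 5, 64)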
  • TensorFlow max pooling

Pooling decreases the size of the output and helps prevent overfitting. Reducing overfitting is a consequence of reducing the output size, which in turn reduces the number of parameters in future layers.

Recently, pooling layers have fallen out of favor. Some reasons:

Today's datasets are large and complex, so we're more concerned with underfitting.

Dropout is a much better regularizer.
Pooling loses information. Consider max pooling: of n numbers we keep only the largest and discard the other n-1 entirely.

# TensorFlow provides the tf.nn.max_pool() function to apply max pooling to convolutional layers

# In tf.nn.max_pool(), the ksize parameter is the filter size and the strides parameter is the stride. A 2x2 filter with a 2x2 stride is a common setting.

# The ksize and strides parameters are also structured as 4-element lists, each element corresponding to a dimension of the input tensor ([batch, height, width, channels]). For both ksize and strides, the batch and channel dimensions are typically set to 1.

# Note: the output depth of a pooling layer is the same as the input depth. Pooling is applied to each depth slice independently

conv_layer = tf.nn.conv2d(input, weight, strides=[1, 2, 2, 1], padding='SAME')
conv_layer = tf.nn.bias_add(conv_layer, bias)
conv_layer = tf.nn.relu(conv_layer)
# Apply Max Pooling
conv_layer = tf.nn.max_pool(
    conv_layer,
    ksize=[1, 2, 2, 1],
    strides=[1, 2, 2, 1],
    padding='SAME')
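Shape check (sketch): the conv layer above (10x10 input, stride 2, 'SAME') produces (?, 5, 5, 64); max pooling with a 2x2 window, stride 2, and 'SAME' padding then gives ceil(5/2) = 3:

print(conv_layer.get_shape())  # (?, 3, 3, 64)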

A three-layer convolutional neural network demo on the MNIST dataset:

# Dataset
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets(".", one_hot=True, reshape=False)

import tensorflow as tf

# Parameters
learning_rate = 0.00001
epochs = 10
batch_size = 128

# Number of samples used to calculate validation and test accuracy
# Decrease this if you're running out of memory
test_valid_size = 256

#####################################################
# Network Parameters
n_classes = 10  # MNIST total classes (0-9 digits)
dropout = 0.75  # Dropout, probability to keep units

# Store layers weight & bias
weights = {
    'wc1': tf.Variable(tf.random_normal([5, 5, 1, 32])),
    'wc2': tf.Variable(tf.random_normal([5, 5, 32, 64])),
    'wd1': tf.Variable(tf.random_normal([7*7*64, 1024])),
    'out': tf.Variable(tf.random_normal([1024, n_classes]))}

biases = {
    'bc1': tf.Variable(tf.random_normal([32])),
    'bc2': tf.Variable(tf.random_normal([64])),
    'bd1': tf.Variable(tf.random_normal([1024])),
    'out': tf.Variable(tf.random_normal([n_classes]))}

##################################################### 
# Convolution
def conv2d(x, W, b, strides=1):
    x = tf.nn.conv2d(x, W, strides=[1, strides, strides, 1], padding='SAME')
    x = tf.nn.bias_add(x, b)
    return tf.nn.relu(x)
# Max pooling
def maxpool2d(x, k=2):
    return tf.nn.max_pool(
        x,
        ksize=[1, k, k, 1],
        strides=[1, k, k, 1],
        padding='SAME')

#####################################################
# Model
# Convolution and max-pooling layers, followed by a fully connected layer and an output layer
def conv_net(x, weights, biases, dropout):
    # Layer 1 - 28*28*1 to 14*14*32
    conv1 = conv2d(x, weights['wc1'], biases['bc1'])
    conv1 = maxpool2d(conv1, k=2)

    # Layer 2 - 14*14*32 to 7*7*64
    conv2 = conv2d(conv1, weights['wc2'], biases['bc2'])
    conv2 = maxpool2d(conv2, k=2)

    # Fully connected layer - 7*7*64 to 1024
    fc1 = tf.reshape(conv2, [-1, weights['wd1'].get_shape().as_list()[0]])
    fc1 = tf.add(tf.matmul(fc1, weights['wd1']), biases['bd1'])
    fc1 = tf.nn.relu(fc1)
    fc1 = tf.nn.dropout(fc1, dropout)

    # Output Layer - class prediction - 1024 to 10
    out = tf.add(tf.matmul(fc1, weights['out']), biases['out'])
    return out
#####################################################  
# Session
# tf Graph input
x = tf.placeholder(tf.float32, [None, 28, 28, 1])
y = tf.placeholder(tf.float32, [None, n_classes])
keep_prob = tf.placeholder(tf.float32)

# Model
logits = conv_net(x, weights, biases, keep_prob)

# Define loss and optimizer
cost = tf.reduce_mean(\
    tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y))
optimizer = tf.train.GradientDescentOptimizer(learning_rate=learning_rate)\
    .minimize(cost)

# Accuracy
correct_pred = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    for epoch in range(epochs):
        for batch in range(mnist.train.num_examples//batch_size):
            batch_x, batch_y = mnist.train.next_batch(batch_size)
            sess.run(optimizer, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: dropout})

            # Calculate batch loss and accuracy
            loss = sess.run(cost, feed_dict={
                x: batch_x,
                y: batch_y,
                keep_prob: 1.})
            valid_acc = sess.run(accuracy, feed_dict={
                x: mnist.validation.images[:test_valid_size],
                y: mnist.validation.labels[:test_valid_size],
                keep_prob: 1.})

            print('Epoch {:>2}, Batch {:>3} - '
                  'Loss: {:>10.4f} Validation Accuracy: {:.6f}'.format(
                epoch + 1,
                batch + 1,
                loss,
                valid_acc))

    # Calculate Test Accuracy
    test_acc = sess.run(accuracy, feed_dict={
        x: mnist.test.images[:test_valid_size],
        y: mnist.test.labels[:test_valid_size],
        keep_prob: 1.})
    print('Testing Accuracy: {}'.format(test_acc))