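[Translation] TensorFlow Tutorial #06 - CIFAR-10

Header image from: GitHub
This article demonstrates image recognition on the CIFAR-10 data-set.
It reuses large portions of text and code from the previous tutorials, so readers who have already seen them can skim quickly.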

01 - Simple Linear Model | 02 - Convolutional Neural Network | 03 - PrettyTensor | 04 - Save & Restore
05 - Ensemble Learning

by Magnus Erik Hvass Pedersen / GitHub / Videos on YouTube
Chinese translation by thrillerist / GitHub

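Introduction

This tutorial shows how to build a convolutional neural network for classifying images in the CIFAR-10 data-set. It also shows how to use different networks during training and testing.

This builds on the previous tutorials, so you should have a basic understanding of TensorFlow and the add-on package Pretty Tensor. Much of the code and text is similar to the previous tutorials and can be skimmed if you have already read them.

Flowchart

The following chart shows roughly how the data flows in the convolutional neural network implemented below. First there is a pre-processing layer which distorts the input images so as to artificially inflate the training-set. Then there are two convolutional layers, two fully-connected layers and finally a softmax classification layer. The weights and outputs of the convolutional layers are shown in larger plots further below, and Tutorial #02 goes into more detail on how convolution works.

In this case the image is mis-classified. The image shows a dog, but the neural network is unsure whether it is a dog or a cat and considers a cat slightly more likely.

from IPython.display import Image
Image('images/06_network_flowchart.png')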

Imports

%matplotlib inline
import matplotlib.pyplot as plt
import tensorflow as tf
import numpy as np
from sklearn.metrics import confusion_matrix
import time
from datetime import timedelta
import math
import os

# Use PrettyTensor to simplify Neural Network construction.
import prettytensor as pt

This was developed using Python 3.5.2 (Anaconda) and TensorFlow version:

tf.__version__

'0.12.0-rc0'

PrettyTensor version:

pt.__version__

'0.7.1'

Load Data

import cifar10

Set the path for storing the data-set on your computer.

# cifar10.data_path = "data/CIFAR-10/"

The CIFAR-10 data-set is about 163 MB and will be downloaded automatically if it is not located in the given path.

cifar10.maybe_download_and_extract()

Data has apparently already been downloaded and unpacked.

Load the class-names.

class_names = cifar10.load_class_names()
class_names

Loading data: data/CIFAR-10/cifar-10-batches-py/batches.meta
['airplane',
'automobile',
'bird',
'cat',
'deer',
'dog',
'frog',
'horse',
'ship',
'truck']

Load the training-set. This returns the images, the class-numbers as integers, and the class-numbers as One-Hot encoded arrays called labels.

images_train, cls_train, labels_train = cifar10.load_training_data()

Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_1
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_2
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_3
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_4
Loading data: data/CIFAR-10/cifar-10-batches-py/data_batch_5
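
To make the relationship between class-numbers and One-Hot encoded labels concrete, here is a minimal NumPy sketch. The helper name one_hot_encoded is chosen for illustration only and is not necessarily how the cifar10 module builds its labels.

```python
import numpy as np

def one_hot_encoded(class_numbers, num_classes):
    # Row i of the identity matrix is the One-Hot vector for class i.
    return np.eye(num_classes, dtype=float)[class_numbers]

cls = np.array([3, 5, 0])  # e.g. cat, dog, airplane
labels = one_hot_encoded(cls, num_classes=10)

# Taking argmax along each row recovers the class-numbers.
cls_recovered = np.argmax(labels, axis=1)
```

Each row of labels holds a single 1.0 at the index of its class, which is why argmax along axis 1 gives back the integer class-numbers.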

Load the test-set.

images_test, cls_test, labels_test = cifar10.load_test_data()

Loading data: data/CIFAR-10/cifar-10-batches-py/test_batch

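The CIFAR-10 data-set has now been loaded and consists of 60,000 images and associated labels (i.e. classifications of the images). The data-set is split into 2 mutually exclusive sub-sets, the training-set and the test-set.

print("Size of:")
print("- Training-set:\t\t{}".format(len(images_train)))
print("- Test-set:\t\t{}".format(len(images_test)))

Size of:
- Training-set:		50000
- Test-set:		10000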

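Data Dimensions

The data dimensions are used in several places in the source-code below. They have already been defined in the cifar10 module, so we just need to import them.

from cifar10 import img_size, num_channels, num_classes

The images are 32 x 32 pixels, but we will crop the images to 24 x 24 pixels.

img_size_cropped = 24

Helper-function for plotting images

Function used to plot 9 images in a 3x3 grid, writing the true and predicted classes below each image.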

def plot_images(images, cls_true, cls_pred=None, smooth=True):

    assert len(images) == len(cls_true) == 9

    # Create figure with sub-plots.
    fig, axes = plt.subplots(3, 3)

    # Use more vertical spacing when predicted classes are also shown.
    if cls_pred is None:
        hspace = 0.3
    else:
        hspace = 0.6
    fig.subplots_adjust(hspace=hspace, wspace=0.3)

    for i, ax in enumerate(axes.flat):
        # Interpolation type.
        if smooth:
            interpolation = 'spline16'
        else:
            interpolation = 'nearest'

        # Plot image.
        ax.imshow(images[i, :, :, :],
                  interpolation=interpolation)

        # Name of the true class.
        cls_true_name = class_names[cls_true[i]]

        # Show true and predicted classes.
        if cls_pred is None:
            xlabel = "True: {0}".format(cls_true_name)
        else:
            # Name of the predicted class.
            cls_pred_name = class_names[cls_pred[i]]

            xlabel = "True: {0}\nPred: {1}".format(cls_true_name, cls_pred_name)

        # Show the classes as the label on the x-axis.
        ax.set_xlabel(xlabel)

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()

Plot a few images to see if data is correct

# Get the first images from the test-set.
images = images_test[0:9]

# Get the true classes for those images.
cls_true = cls_test[0:9]

# Plot the images and labels using our helper-function above.
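plot_images(images=images, cls_true=cls_true, smooth=False)

The pixelated images above are what the neural network will get as input. The images might be a bit easier for the human eye to recognize if we smoothen the pixels.

plot_images(images=images, cls_true=cls_true, smooth=True)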

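TensorFlow Graph

The entire purpose of TensorFlow is to have a so-called computational graph that can be executed much more efficiently than if the same calculations were performed directly in Python. TensorFlow can be more efficient than NumPy because TensorFlow knows the entire computational graph that must be executed, while NumPy only knows the computation of a single mathematical operation at a time.

TensorFlow can also automatically calculate the gradients that are needed to optimize the variables of the graph so as to make the model perform better. This is because the graph is a combination of simple mathematical expressions, so the gradient of the entire graph can be calculated using the chain-rule for derivatives.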
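
As a small illustration of the chain-rule idea, the following pure-Python sketch differentiates a tiny composite expression by hand and checks the result against a finite difference. The function names and the numbers are made up for this example; TensorFlow performs the equivalent steps automatically on its graph.

```python
def loss(w, x, y):
    # Composite function: u = w * x, e = u - y, loss = e ** 2
    return (w * x - y) ** 2

def loss_gradient(w, x, y):
    # Chain rule: d(loss)/dw = d(loss)/de * de/du * du/dw = 2*e * 1 * x
    e = w * x - y
    return 2.0 * e * x

w, x, y = 0.5, 3.0, 2.0
analytic = loss_gradient(w, x, y)

# Central finite difference as an independent check.
h = 1e-6
numeric = (loss(w + h, x, y) - loss(w - h, x, y)) / (2 * h)
```

The two values agree up to rounding error, which is the kind of check often used to validate hand-written gradients.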

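TensorFlow can also take advantage of multi-core CPUs as well as GPUs, and Google has even built special chips just for TensorFlow, called TPUs (Tensor Processing Units), which are said to be even faster than GPUs.

A TensorFlow graph consists of the following parts which will be detailed below:

  • Placeholder variables used for inputting data to the graph.
  • Model variables that are going to be optimized so as to make the model perform better.
  • The model, which is essentially a mathematical function that calculates some output given the input in the placeholder variables and the model variables.
  • A cost measure that can be used to guide the optimization of the variables.
  • An optimization method which updates the variables of the model.

In addition, the TensorFlow graph may also contain various debugging statements, e.g. for logging data to be displayed using TensorBoard, which is not covered in this tutorial.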

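Placeholder variables

Placeholder variables serve as the input to the TensorFlow computational graph that we may change each time we execute the graph. We call this feeding the placeholder variables and it is demonstrated further below.

First we define the placeholder variable for the input images. This allows us to change the images that are input to the TensorFlow graph. This is a so-called tensor, which just means that it is a multi-dimensional array. The data-type is set to float32 and the shape is set to [None, img_size, img_size, num_channels], where None means that the tensor may hold an arbitrary number of images, each image being img_size pixels high and img_size pixels wide with num_channels colour channels.

x = tf.placeholder(tf.float32, shape=[None, img_size, img_size, num_channels], name='x')

Next we have the placeholder variable for the true labels associated with the images that were input in the placeholder variable x. The shape of this placeholder variable is [None, num_classes], which means it may hold an arbitrary number of labels and each label is a vector of length num_classes, which is 10 in this case.

y_true = tf.placeholder(tf.float32, shape=[None, num_classes], name='y_true')

We could also have a placeholder variable for the class-number, but we will instead calculate it using argmax. Note that this is a TensorFlow operator, so nothing is calculated at this point.

y_true_cls = tf.argmax(y_true, dimension=1)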

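Helper-function for creating Pre-Processing

The following helper-functions create the part of the TensorFlow computational graph that pre-processes the input images. Nothing is actually calculated at this point, the functions merely add nodes to the computational graph for TensorFlow.

The pre-processing is different for training and testing of the neural network:

  • For training, the input images are randomly cropped, randomly flipped horizontally, and the hue, contrast, brightness and saturation are adjusted with random values. This artificially inflates the size of the training-set by creating random variations of the original input images. Examples of distorted images are shown further below.

  • For testing, the input images are cropped around the centre and nothing else is adjusted.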

def pre_process_image(image, training):
    # This function takes a single image as input,
    # and a boolean whether to build the training or testing graph.

    if training:
        # For training, add the following to the TensorFlow graph.

        # Randomly crop the input image.
        image = tf.random_crop(image, size=[img_size_cropped, img_size_cropped, num_channels])

        # Randomly flip the image horizontally.
        image = tf.image.random_flip_left_right(image)

        # Randomly adjust hue, contrast, brightness and saturation.
        image = tf.image.random_hue(image, max_delta=0.05)
        image = tf.image.random_contrast(image, lower=0.3, upper=1.0)
        image = tf.image.random_brightness(image, max_delta=0.2)
        image = tf.image.random_saturation(image, lower=0.0, upper=2.0)

        # Some of these functions may overflow and result in pixel
        # values beyond the [0, 1] range. It is unclear from the
        # documentation of TensorFlow 0.10.0rc0 whether this is
        # intended. A simple solution is to limit the range.

        # Limit the image pixels between [0, 1] in case of overflow.
        image = tf.minimum(image, 1.0)
        image = tf.maximum(image, 0.0)
    else:
        # For testing, add the following to the TensorFlow graph.

        # Crop the input image around the centre so it is the same
        # size as images that are randomly cropped during training.
        image = tf.image.resize_image_with_crop_or_pad(image,
                                                       target_height=img_size_cropped,
                                                       target_width=img_size_cropped)

    return image

The function above is called for each image in the input batch using the following function.

def pre_process(images, training):
    # Use TensorFlow to loop over all the input images and call
    # the function above which takes a single image as input.
    images = tf.map_fn(lambda image: pre_process_image(image, training), images)

    return images
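
The cropping and flipping steps can be sketched in plain NumPy to show what the graph nodes compute. This is only an illustration under simplifying assumptions: the colour adjustments are left out, and the function names random_crop_and_flip and centre_crop are invented for this example rather than taken from TensorFlow.

```python
import numpy as np

def random_crop_and_flip(image, crop_size, rng):
    # Pick a random top-left corner so the crop fits inside the image.
    height, width, _ = image.shape
    top = rng.randint(0, height - crop_size + 1)
    left = rng.randint(0, width - crop_size + 1)
    cropped = image[top:top + crop_size, left:left + crop_size, :]
    # Flip left-right with probability 0.5.
    if rng.rand() < 0.5:
        cropped = cropped[:, ::-1, :]
    return cropped

def centre_crop(image, crop_size):
    # Deterministic crop around the centre, as used for testing.
    height, width, _ = image.shape
    top = (height - crop_size) // 2
    left = (width - crop_size) // 2
    return image[top:top + crop_size, left:left + crop_size, :]

rng = np.random.RandomState(0)
image = rng.rand(32, 32, 3)
train_variant = random_crop_and_flip(image, crop_size=24, rng=rng)
test_variant = centre_crop(image, crop_size=24)
```

Both variants have shape (24, 24, 3), so the same network layers can process training and test images even though the crops are chosen differently.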

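In order to plot the distorted images, we create the pre-processing graph for TensorFlow, so we may execute it later.

distorted_images = pre_process(images=x, training=True)

Helper-function for creating Main Processing

The following helper-function creates the main part of the convolutional neural network. It uses Pretty Tensor which was described in the previous tutorials.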

def main_network(images, training):
    # Wrap the input images as a Pretty Tensor object.
    x_pretty = pt.wrap(images)

    # Pretty Tensor uses special numbers to distinguish between
    # the training and testing phases.
    if training:
        phase = pt.Phase.train
    else:
        phase = pt.Phase.infer

    # Create the convolutional neural network using Pretty Tensor.
    # It is very similar to the previous tutorials, except
    # the use of so-called batch-normalization in the first layer.
    with pt.defaults_scope(activation_fn=tf.nn.relu, phase=phase):
        y_pred, loss = x_pretty.\
            conv2d(kernel=5, depth=64, name='layer_conv1', batch_normalize=True).\
            max_pool(kernel=2, stride=2).\
            conv2d(kernel=5, depth=64, name='layer_conv2').\
            max_pool(kernel=2, stride=2).\
            flatten().\
            fully_connected(size=256, name='layer_fc1').\
            fully_connected(size=128, name='layer_fc2').\
            softmax_classifier(num_classes=num_classes, labels=y_true)

    return y_pred, loss
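
The tensor sizes through the network above can be worked out by hand. The sketch below assumes that both the convolutions (stride 1) and the max-pooling use 'SAME' padding, which is believed to be Pretty Tensor's default but is not stated in this tutorial, so treat the numbers as an estimate under that assumption.

```python
import math

def same_padding_output(size, stride):
    # With 'SAME' padding the output size is ceil(input / stride).
    return int(math.ceil(size / float(stride)))

size = 24                                   # img_size_cropped
size = same_padding_output(size, stride=1)  # layer_conv1 -> 24
size = same_padding_output(size, stride=2)  # max_pool    -> 12
size = same_padding_output(size, stride=1)  # layer_conv2 -> 12
size = same_padding_output(size, stride=2)  # max_pool    -> 6

depth = 64
# Length of each flattened vector that is fed into layer_fc1.
num_features = size * size * depth
```

Under these assumptions flatten() produces 6 * 6 * 64 = 2304 features per image, which are then reduced to 256, 128 and finally num_classes outputs.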

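Helper-function for creating Neural Network

The following helper-function creates the full neural network, which consists of the pre-processing and main-processing defined above.

Note that the neural network is enclosed in the variable-scope named 'network'. This is because we are actually creating two neural networks in the TensorFlow graph. By assigning a variable-scope like this, we can re-use the variables for the two neural networks, so the variables that are optimized for the training-network are re-used for the other network that is used for testing.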

def create_network(training):
    # Wrap the neural network in the scope named 'network'.
    # Create new variables during training, and re-use during testing.
    with tf.variable_scope('network', reuse=not training):
        # Just rename the input placeholder variable for convenience.
        images = x

        # Create TensorFlow graph for pre-processing.
        images = pre_process(images=images, training=training)

        # Create TensorFlow graph for the main processing.
        y_pred, loss = main_network(images=images, training=training)

    return y_pred, loss
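
The create-then-re-use pattern can be pictured with a toy registry in pure Python. This is a deliberately simplified model that only loosely mimics the idea behind variable scopes; the dictionary _variables and both functions below are hypothetical and are not TensorFlow's implementation.

```python
_variables = {}  # hypothetical registry: full variable name -> value holder

def get_variable(scope, name, reuse, initial_value=0.0):
    full_name = scope + "/" + name
    if reuse:
        # Re-use: the variable must already exist.
        if full_name not in _variables:
            raise ValueError("Variable does not exist: " + full_name)
    else:
        # Create: the variable must not exist yet.
        if full_name in _variables:
            raise ValueError("Variable already exists: " + full_name)
        _variables[full_name] = [initial_value]
    return _variables[full_name]

def create_toy_network(training):
    # Mirrors create_network(): create when training, re-use when testing.
    return get_variable("network/layer_conv1", "weights", reuse=not training)

w_train = create_toy_network(training=True)
w_test = create_toy_network(training=False)

w_train[0] = 1.5          # "optimizing" the training network ...
shared_value = w_test[0]  # ... is visible from the test network.
```

Because both calls resolve to the same name inside the same scope, the two networks share one set of weights, which is the effect that reuse=not training has in create_network().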

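Create Neural Network for Training Phase

First create a TensorFlow variable that keeps track of the number of optimization iterations performed so far. In the previous tutorials this was a Python variable, but in this tutorial we want to save this variable with all the other TensorFlow variables in the checkpoints.

Note that trainable=False means that TensorFlow will not try to optimize this variable.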

global_step = tf.Variable(initial_value=0,
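                          name='global_step', trainable=False)

Create the neural network to be used for training. The create_network() function returns both y_pred and loss, but we only need the loss-function during training.

_, loss = create_network(training=True)

Create an optimizer which will minimize the loss-function. Also pass the global_step variable to the optimizer so it will be increased by one after each iteration.

optimizer = tf.train.AdamOptimizer(learning_rate=1e-4).minimize(loss, global_step=global_step)

Create Neural Network for Test Phase

Now create the neural network for the test-phase. Once again the create_network() function returns the predicted labels y_pred for the input images, as well as the loss-function to be used during optimization. During testing we only need y_pred.

y_pred, _ = create_network(training=False)

We then calculate the predicted class-number as an integer. The output of the network y_pred is an array with 10 elements. The class-number is the index of the largest element in the array.

y_pred_cls = tf.argmax(y_pred, dimension=1)

Then we create a vector of booleans telling us whether the predicted class equals the true class of each image.

correct_prediction = tf.equal(y_pred_cls, y_true_cls)

The classification accuracy is calculated by first type-casting the vector of booleans to floats, so that False becomes 0 and True becomes 1, and then taking the average of these numbers.

accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))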
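
The same cast-and-average calculation can be tried in NumPy on a handful of made-up predictions. The arrays below are illustrative values, not results from the network.

```python
import numpy as np

y_pred_cls = np.array([3, 5, 5, 0, 9])
y_true_cls = np.array([3, 3, 5, 0, 1])

# Element-wise comparison gives a boolean vector.
correct_prediction = (y_pred_cls == y_true_cls)

# Casting to float turns False into 0.0 and True into 1.0,
# so the mean is the fraction of correct predictions.
accuracy = correct_prediction.astype(np.float32).mean()
```

Three of the five predictions match here, so the mean of the cast vector is 3/5.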

Saver

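In order to save the variables of the neural network, so they can be reloaded quickly without having to train the network again, we now create a so-called Saver-object which is used for storing and retrieving all the variables of the TensorFlow graph. Nothing is actually saved at this point, which will be done further below in the optimize() function.

saver = tf.train.Saver()

Getting the Weights

Further below, we want to plot the weights of the neural network. When the network is constructed using Pretty Tensor, all the variables of the layers are created indirectly by Pretty Tensor. We therefore have to retrieve the variables from TensorFlow.

We used the names layer_conv1 and layer_conv2 for the two convolutional layers. These are also called variable scopes (not to be confused with defaults_scope as described above). Pretty Tensor automatically gives names to the variables it creates for each layer, so we can retrieve the weights for a layer using the layer's scope-name and the variable-name.

The implementation is somewhat awkward because we have to use the TensorFlow function get_variable(), which was designed for another purpose: either creating a new variable or re-using an existing variable. The easiest thing is to make the following helper-function.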

def get_weights_variable(layer_name):
    # Retrieve an existing variable named 'weights' in the scope
    # with the given layer_name.
    # This is awkward because the TensorFlow function was
    # really intended for another purpose.

    with tf.variable_scope("network/" + layer_name, reuse=True):
        variable = tf.get_variable('weights')

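    return variable

Using this helper-function we can retrieve the variables. These are TensorFlow objects. In order to get the contents of the variables, you must do something like: contents = session.run(weights_conv1), as demonstrated further below.

weights_conv1 = get_weights_variable(layer_name='layer_conv1')
weights_conv2 = get_weights_variable(layer_name='layer_conv2')

Getting the Layer Outputs

Similarly we also need to retrieve the outputs of the convolutional layers. The function for doing this is slightly different from the function above for getting the weights. Here we instead retrieve the last tensor that is output by the convolutional layer.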

def get_layer_output(layer_name):
    # The name of the last operation of the convolutional layer.
    # This assumes you are using Relu as the activation-function.
    tensor_name = "network/" + layer_name + "/Relu:0"

    # Get the tensor with this name.
    tensor = tf.get_default_graph().get_tensor_by_name(tensor_name)

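    return tensor

Get the output of the convolutional layers so we can plot them later.

output_conv1 = get_layer_output(layer_name='layer_conv1')
output_conv2 = get_layer_output(layer_name='layer_conv2')

TensorFlow Run

Create TensorFlow session

Once the TensorFlow graph has been created, we have to create a TensorFlow session which is used to execute the graph.

session = tf.Session()

Restore or initialize variables

Training this neural network may take a long time, especially if you do not have a GPU. We therefore save checkpoints during training so we can continue training at another time (e.g. during the night), and also for performing analysis later without having to train the neural network every time we want to use it.

If you want to restart the training of the neural network, you have to delete the checkpoints first.

This is the directory used for the checkpoints.

save_dir = 'checkpoints/'

Create the directory if it does not exist.

if not os.path.exists(save_dir):
    os.makedirs(save_dir)

This is the base-filename for the checkpoints, TensorFlow will append the iteration number, etc.

save_path = os.path.join(save_dir, 'cifar10_cnn')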
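
As a rough picture of the resulting names, the sketch below joins the base-filename with a step counter in the same form as the checkpoint path printed later in this tutorial (checkpoints/cifar10_cnn-150000). The helper checkpoint_prefix is hypothetical and only imitates that naming; any further file suffixes written to disk are not shown here.

```python
import os

save_dir = 'checkpoints/'
save_path = os.path.join(save_dir, 'cifar10_cnn')

def checkpoint_prefix(save_path, global_step):
    # Hypothetical helper: base-filename, a dash, then the step counter.
    return "{0}-{1}".format(save_path, global_step)

prefix = checkpoint_prefix(save_path, 150000)
```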

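Try to restore the latest checkpoint. This may fail and raise an exception, e.g. if such a checkpoint does not exist, or if you have changed the TensorFlow graph.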

try:
    print("Trying to restore last checkpoint ...")

    # Use TensorFlow to find the latest checkpoint - if any.
    last_chk_path = tf.train.latest_checkpoint(checkpoint_dir=save_dir)

    # Try and load the data in the checkpoint.
    saver.restore(session, save_path=last_chk_path)

    # If we get to this point, the checkpoint was successfully loaded.
    print("Restored checkpoint from:", last_chk_path)
except Exception:
    # If the above failed for some reason, simply
    # initialize all the variables for the TensorFlow graph.
    print("Failed to restore checkpoint. Initializing variables instead.")
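    session.run(tf.global_variables_initializer())

Trying to restore last checkpoint ...
Restored checkpoint from: checkpoints/cifar10_cnn-150000

Helper-function to get a random training-batch

There are 50,000 images in the training-set. It takes a long time to calculate the gradient of the model using all these images. We therefore only use a small batch of images in each iteration of the optimizer.

If your computer crashes or becomes very slow because you run out of RAM, then you may try and lower this number, but you may then need to perform more optimization iterations.

train_batch_size = 64

Function for selecting a random batch of images from the training-set.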

def random_batch():
    # Number of images in the training-set.
    num_images = len(images_train)

    # Create a random index.
    idx = np.random.choice(num_images,
                           size=train_batch_size,
                           replace=False)

    # Use the random index to select random images and labels.
    x_batch = images_train[idx, :, :, :]
    y_batch = labels_train[idx, :]

    return x_batch, y_batch
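
The effect of replace=False in random_batch() can be checked in isolation: every index in a batch is distinct, so no image appears twice in the same batch. The seed below is only there to make this small sketch repeatable.

```python
import numpy as np

num_images = 50000
train_batch_size = 64

np.random.seed(0)  # only to make this sketch repeatable
idx = np.random.choice(num_images, size=train_batch_size, replace=False)

# With replace=False all sampled indices are distinct.
num_unique = len(np.unique(idx))
```

Different batches are still drawn independently of each other, so an image may appear again in a later batch.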

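Helper-function to perform optimization iterations

This function performs a number of optimization iterations so as to gradually improve the variables of the network layers. In each iteration, a new batch of data is selected from the training-set and then TensorFlow executes the optimizer using those training samples. The progress is printed every 100 iterations. A checkpoint is saved every 1000 iterations and also after the last iteration.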

def optimize(num_iterations):
    # Start-time used for printing time-usage below.
    start_time = time.time()

    for i in range(num_iterations):
        # Get a batch of training examples.
        # x_batch now holds a batch of images and
        # y_true_batch are the true labels for those images.
        x_batch, y_true_batch = random_batch()

        # Put the batch into a dict with the proper names
        # for placeholder variables in the TensorFlow graph.
        feed_dict_train = {x: x_batch,
                           y_true: y_true_batch}

        # Run the optimizer using this batch of training data.
        # TensorFlow assigns the variables in feed_dict_train
        # to the placeholder variables and then runs the optimizer.
        # We also want to retrieve the global_step counter.
        i_global, _ = session.run([global_step, optimizer],
                                  feed_dict=feed_dict_train)

        # Print status to screen every 100 iterations (and last).
        if (i_global % 100 == 0) or (i == num_iterations - 1):
            # Calculate the accuracy on the training-batch.
            batch_acc = session.run(accuracy,
                                    feed_dict=feed_dict_train)

            # Print status.
            msg = "Global Step: {0:>6}, Training Batch Accuracy: {1:>6.1%}"
            print(msg.format(i_global, batch_acc))

        # Save a checkpoint to disk every 1000 iterations (and last).
        if (i_global % 1000 == 0) or (i == num_iterations - 1):
            # Save all variables of the TensorFlow graph to a
            # checkpoint. Append the global_step counter
            # to the filename so we save the last several checkpoints.
            saver.save(session,
                       save_path=save_path,
                       global_step=global_step)

            print("Saved checkpoint.")

    # Ending time.
    end_time = time.time()

    # Difference between start and end-times.
    time_dif = end_time - start_time

    # Print the time-usage.
    print("Time usage: " + str(timedelta(seconds=int(round(time_dif)))))
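
The time-usage line at the end of optimize() relies on timedelta to turn a number of seconds into a readable string. A short sketch with a made-up duration shows the format:

```python
from datetime import timedelta

time_dif = 3725.4  # seconds, a made-up value for illustration

# Round to whole seconds, then let timedelta format it as h:mm:ss.
formatted = str(timedelta(seconds=int(round(time_dif))))
```

Here 3725 seconds are shown as 1 hour, 2 minutes and 5 seconds.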

Helper-function to plot example errors

Function for plotting examples of images from the test-set that have been mis-classified.

def plot_example_errors(cls_pred, correct):
    # This function is called from print_test_accuracy() below.

    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # correct is a boolean array whether the predicted class
    # is equal to the true class for each image in the test-set.

    # Negate the boolean array.
    incorrect = (correct == False)

    # Get the images from the test-set that have been
    # incorrectly classified.
    images = images_test[incorrect]

    # Get the predicted classes for those images.
    cls_pred = cls_pred[incorrect]

    # Get the true classes for those images.
    cls_true = cls_test[incorrect]

    # Plot the first 9 images.
    plot_images(images=images[0:9],
                cls_true=cls_true[0:9],
                cls_pred=cls_pred[0:9])

Helper-function to plot confusion matrix

def plot_confusion_matrix(cls_pred):
    # This is called from print_test_accuracy() below.

    # cls_pred is an array of the predicted class-number for
    # all images in the test-set.

    # Get the confusion matrix using sklearn.
    cm = confusion_matrix(y_true=cls_test,  # True class for test-set.
                          y_pred=cls_pred)  # Predicted class.

    # Print the confusion matrix as text.
    for i in range(num_classes):
        # Append the class-name to each line.
        class_name = "({}) {}".format(i, class_names[i])
        print(cm[i, :], class_name)

    # Print the class-numbers for easy reference.
    class_numbers = [" ({0})".format(i) for i in range(num_classes)]
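    print("".join(class_numbers))

Helper-functions for calculating classifications

This function calculates the predicted classes of images and also returns a boolean array whether the classification of each image is correct.

The calculation is done in batches because it might use too much RAM otherwise. If your computer crashes then you can try and lower the batch-size.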

# Split the data-set in batches of this size to limit RAM usage.
batch_size = 256

def predict_cls(images, labels, cls_true):
    # Number of images.
    num_images = len(images)

    # Allocate an array for the predicted classes which
    # will be calculated in batches and filled into this array.
    cls_pred = np.zeros(shape=num_images, dtype=int)

    # Now calculate the predicted classes for the batches.
    # We will just iterate through all the batches.
    # There might be a more clever and Pythonic way of doing this.

    # The starting index for the next batch is denoted i.
    i = 0

    while i < num_images:
        # The ending index for the next batch is denoted j.
        j = min(i + batch_size, num_images)

        # Create a feed-dict with the images and labels
        # between index i and j.
        feed_dict = {x: images[i:j, :],
                     y_true: labels[i:j, :]}

        # Calculate the predicted class using TensorFlow.
        cls_pred[i:j] = session.run(y_pred_cls, feed_dict=feed_dict)

        # Set the start-index for the next batch to the
        # end-index of the current batch.
        i = j

    # Create a boolean array whether each image is correctly classified.
    correct = (cls_true == cls_pred)

    return correct, cls_pred
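
The index bookkeeping in the while-loop above can be isolated into a small generator. The name batch_ranges is invented for this sketch; it yields the same (i, j) pairs that predict_cls() uses for slicing.

```python
def batch_ranges(num_images, batch_size):
    # Yield (start, end) index pairs covering range(num_images).
    i = 0
    while i < num_images:
        # The last batch may be shorter than batch_size.
        j = min(i + batch_size, num_images)
        yield i, j
        i = j

ranges = list(batch_ranges(num_images=10, batch_size=4))
test_set_ranges = list(batch_ranges(num_images=10000, batch_size=256))
```

For the 10,000 test images and a batch-size of 256 this gives 40 batches, the last one holding only the remaining 16 images.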

Calculate the predicted class for the test-set.

def predict_cls_test():
    return predict_cls(images = images_test,
                       labels = labels_test,
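                       cls_true = cls_test)

Helper-functions for the classification accuracy

This function calculates the classification accuracy given a boolean array whether each image was correctly classified. For example, for the boolean array [True, True, False, False, False] the accuracy is 2/5 = 0.4. The function also returns the number of correct classifications.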

def classification_accuracy(correct):
    # When averaging a boolean array, False means 0 and True means 1.
    # So we are calculating: number of True / len(correct) which is
    # the same as the classification accuracy.

    # Return the classification accuracy
    # and the number of correct classifications.
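    return correct.mean(), correct.sum()

Helper-function for showing the performance

Function for printing the classification accuracy on the test-set.

It takes a while to compute the classification for all the images in the test-set, which is why the results are re-used by calling the above functions directly from this function, so the classifications do not have to be recalculated by each function.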

def print_test_accuracy(show_example_errors=False, show_confusion_matrix=False):

    # For all the images in the test-set,
    # calculate the predicted classes and whether they are correct.
    correct, cls_pred = predict_cls_test()

    # Classification accuracy and the number of correct classifications.
    acc, num_correct = classification_accuracy(correct)

    # Number of images being classified.
    num_images = len(correct)

    # Print the accuracy.
    msg = "Accuracy on Test-Set: {0:.1%} ({1} / {2})"
    print(msg.format(acc, num_correct, num_images))

    # Plot some examples of mis-classifications, if desired.
    if show_example_errors:
        print("Example errors:")
        plot_example_errors(cls_pred=cls_pred, correct=correct)

    # Plot the confusion matrix, if desired.
    if show_confusion_matrix:
        print("Confusion Matrix:")
        plot_confusion_matrix(cls_pred=cls_pred)

Helper-function for plotting convolutional weights

def plot_conv_weights(weights, input_channel=0):
    # Assume weights are TensorFlow ops for 4-dim variables
    # e.g. weights_conv1 or weights_conv2.

    # Retrieve the values of the weight-variables from TensorFlow.
    # A feed-dict is not necessary because nothing is calculated.
    w = session.run(weights)

    # Print statistics for the weights.
    print("Min: {0:.5f}, Max: {1:.5f}".format(w.min(), w.max()))
    print("Mean: {0:.5f}, Stdev: {1:.5f}".format(w.mean(), w.std()))

    # Get the lowest and highest values for the weights.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    w_min = np.min(w)
    w_max = np.max(w)
    abs_max = max(abs(w_min), abs(w_max))

    # Number of filters used in the conv. layer.
    num_filters = w.shape[3]

    # Number of grids to plot.
    # Rounded-up, square-root of the number of filters.
    num_grids = math.ceil(math.sqrt(num_filters))

    # Create figure with a grid of sub-plots.
    fig, axes = plt.subplots(num_grids, num_grids)

    # Plot all the filter-weights.
    for i, ax in enumerate(axes.flat):
        # Only plot the valid filter-weights.
        if i<num_filters:
            # Get the weights for the i'th filter of the input channel.
            # The format of this 4-dim tensor is determined by the
            # TensorFlow API. See Tutorial #02 for more details.
            img = w[:, :, input_channel, i]

            # Plot image.
            ax.imshow(img, vmin=-abs_max, vmax=abs_max,
                      interpolation='nearest', cmap='seismic')

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
    plt.show()
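
The grid size used for the sub-plots is the rounded-up square root of the number of filters. A tiny sketch (the function name grid_size is made up here) shows why some grid cells can stay empty, which is what the check i<num_filters in the loop guards against:

```python
import math

def grid_size(num_filters):
    # Smallest n such that an n x n grid holds all filters.
    return int(math.ceil(math.sqrt(num_filters)))

n64 = grid_size(64)   # the conv-layers above have 64 filters
n10 = grid_size(10)   # a count that is not a perfect square
unused_cells = n10 * n10 - 10
```

With 64 filters the 8 x 8 grid is filled exactly, whereas 10 filters would need a 4 x 4 grid with 6 cells left blank.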

Helper-function for plotting the output of convolutional layers

def plot_layer_output(layer_output, image):
    # Assume layer_output is a 4-dim tensor
    # e.g. output_conv1 or output_conv2.

    # Create a feed-dict which holds the single input image.
    # Note that TensorFlow needs a list of images,
    # so we just create a list with this one image.
    feed_dict = {x: [image]}

    # Retrieve the output of the layer after inputting this image.
    values = session.run(layer_output, feed_dict=feed_dict)

    # Get the lowest and highest values.
    # This is used to correct the colour intensity across
    # the images so they can be compared with each other.
    values_min = np.min(values)
    values_max = np.max(values)

    # Number of image channels output by the conv. layer.
    num_images = values.shape[3]

    # Number of grid-cells to plot.
    # Rounded-up, square-root of the number of filters.
    num_grids = math.ceil(math.sqrt(num_images))

    # Create figure with a grid of sub-plots.
    fig, axes = plt.subplots(num_grids, num_grids)

    # Plot all the output channels of the layer.
    for i, ax in enumerate(axes.flat):
        # Only plot the valid image-channels.
        if i<num_images:
            # Get the images for the i'th output channel.
            img = values[0, :, :, i]

            # Plot image.
            ax.imshow(img, vmin=values_min, vmax=values_max,
                      interpolation='nearest', cmap='binary')

        # Remove ticks from the plot.
        ax.set_xticks([])
        ax.set_yticks([])

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
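    plt.show()

Examples of distorted input images

In order to artificially inflate the number of images available for training, the neural network uses pre-processing with random distortions of the input images. This should hopefully make the neural network more flexible at recognizing and classifying images.

This is a helper-function for plotting distorted input images.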

def plot_distorted_image(image, cls_true):
    # Repeat the input image 9 times.
    image_duplicates = np.repeat(image[np.newaxis, :, :, :], 9, axis=0)

    # Create a feed-dict for TensorFlow.
    feed_dict = {x: image_duplicates}

    # Calculate only the pre-processing of the TensorFlow graph
    # which distorts the images in the feed-dict.
    result = session.run(distorted_images, feed_dict=feed_dict)

    # Plot the images.
    plot_images(images=result, cls_true=np.repeat(cls_true, 9))
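
The shapes produced by np.newaxis and np.repeat in plot_distorted_image() may be easier to follow on a dummy array. The all-zero image and the class-number 3 below are placeholders for illustration.

```python
import numpy as np

image = np.zeros((32, 32, 3), dtype=np.float32)

# np.newaxis adds a leading batch axis: (32, 32, 3) -> (1, 32, 32, 3).
batch_of_one = image[np.newaxis, :, :, :]

# Repeating along axis 0 gives 9 identical copies: (9, 32, 32, 3).
image_duplicates = np.repeat(batch_of_one, 9, axis=0)

# Repeating a scalar class-number gives a length-9 array.
cls_duplicates = np.repeat(3, 9)
```

The 9 copies are identical when fed in, so any differences in the plotted result come from the random distortions in the pre-processing graph.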

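Helper-function for getting an image and its class-number from the test-set.

def get_test_image(i):
    return images_test[i, :, :, :], cls_test[i]

Get an image and its true class from the test-set.

img, cls = get_test_image(16)

Plot 9 random distortions of the image. If you re-run this code you will get slightly different results.

plot_distorted_image(img, cls)

Perform optimization

My laptop computer is a Quad-Core with 2 GHz per core. It has a GPU but it is not fast enough for TensorFlow so it only uses the CPU. It takes about 1 hour to perform 10,000 optimization iterations using the CPU on this PC. For this tutorial I performed 150,000 optimization iterations so that took about 15 hours. I let it run during the night and at various points during the day.

Because we are saving the checkpoints during optimization, and because we are restoring the latest checkpoint when restarting the code, we can stop and continue the optimization later.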

if False:
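    optimize(num_iterations=1000)

Results

After 150,000 optimization iterations, the classification accuracy is about 79-80% on the test-set. Examples of mis-classifications are plotted below. Some of these are difficult to recognize even for humans and others are reasonable mistakes, e.g. automobile vs. truck and cat vs. dog, while other mistakes seem a bit strange.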

print_test_accuracy(show_example_errors=True,
                    show_confusion_matrix=True)

Accuracy on Test-Set: 79.3% (7932 / 10000)
Example errors:

Confusion Matrix:
[775  20  71   8  14   4  18  10  44  36] (0) airplane
[  7 914   5   0   3   7   9   3  14  38] (1) automobile
[ 32   2 724  28  42  44  94  17   9   8] (2) bird
[ 18   7  48 508  56 209  99  29   7  19] (3) cat
[  4   2  45  25 769  29  75  43   3   5] (4) deer
[  8   6  34  89  35 748  38  32   1   9] (5) dog
[  4   2  18   9  14  14 930   4   2   3] (6) frog
[  6   2  23  18  31  55  17 833   0  15] (7) horse
[ 31  29  15  11   8   7  15   0 856  28] (8) ship
[ 13  67   4   5   0   4   7   7  18 875] (9) truck
 (0) (1) (2) (3) (4) (5) (6) (7) (8) (9)

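Convolutional Weights

The following shows some of the weights (or filters) for the first convolutional layer. There are 3 input channels so there are 3 of these sets, which you may plot by changing the input_channel.

Note that positive weights are red and negative weights are blue.

plot_conv_weights(weights=weights_conv1, input_channel=0)

Min:  -0.61643, Max:   0.63949
Mean: -0.00177, Stdev: 0.16469

The following shows some of the weights (or filters) for the second convolutional layer. These are apparently closer to zero than the weights for the first convolutional layer, as seen from the lower standard deviation.

plot_conv_weights(weights=weights_conv2, input_channel=1)

Min:  -0.73326, Max:   0.25344
Mean: -0.00394, Stdev: 0.05466

Output of convolutional layers

Helper-function for plotting an image.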

def plot_image(image):
    # Create figure with sub-plots.
    fig, axes = plt.subplots(1, 2)

    # References to the sub-plots.
    ax0 = axes.flat[0]
    ax1 = axes.flat[1]

    # Show raw and smoothened images in sub-plots.
    ax0.imshow(image, interpolation='nearest')
    ax1.imshow(image, interpolation='spline16')

    # Set labels.
    ax0.set_xlabel('Raw')
    ax1.set_xlabel('Smooth')

    # Ensure the plot is shown correctly with multiple plots
    # in a single Notebook cell.
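    plt.show()

Plot an image from the test-set. The raw pixelated image is used as input to the neural network.

img, cls = get_test_image(16)
plot_image(img)

Use the raw image as input to the neural network and plot the output of the first convolutional layer.

plot_layer_output(output_conv1, image=img)

Using the same image as input to the neural network, now plot the output of the second convolutional layer.

plot_layer_output(output_conv2, image=img)

Predicted class-labels

Get the predicted class-label and class-number for this image.

label_pred, cls_pred = session.run([y_pred, y_pred_cls],
                                   feed_dict={x: [img]})

Print the predicted class-label.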

# Set the rounding options for numpy.
np.set_printoptions(precision=3, suppress=True)

# Print the predicted label.
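print(label_pred[0])

[ 0. 0. 0. 0.493 0. 0.49 0.006 0.01 0. 0. ]

The predicted class-label is an array of length 10, with each element indicating how confident the neural network is that the image is the given class.

In this case the element with index 3 has a value of 0.493, while the element with index 5 has a value of 0.490. This means the neural network believes the image shows either class 3 or class 5, which is a cat or a dog, respectively.

class_names[3]

'cat'

class_names[5]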

'dog'
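
Reading off the prediction by hand amounts to an argmax over the printed array. The following pure-Python sketch copies the values printed above and the class-names loaded earlier, then ranks the two most confident classes.

```python
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

# Values copied from the printed label_pred[0] above.
label_pred = [0., 0., 0., 0.493, 0., 0.49, 0.006, 0.01, 0., 0.]

# Index of the largest element, i.e. the predicted class-number.
cls_pred = max(range(len(label_pred)), key=lambda i: label_pred[i])
predicted_name = class_names[cls_pred]

# Sort classes by confidence to see the runner-up as well.
ranking = sorted(zip(label_pred, class_names), reverse=True)[:2]
```

The winner and the runner-up differ by only 0.003, which matches the network's uncertainty between cat and dog described above.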

Close TensorFlow Session

We are now done using TensorFlow, so we close the session to release its resources.

# This has been commented out in case you want to modify and experiment
# with the Notebook without having to restart it.
# session.close()

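Conclusion

This tutorial showed how to make a convolutional neural network for classifying images in the CIFAR-10 data-set. The classification accuracy was about 79-80% on the test-set.

The output of the convolutional layers was also plotted, but it was difficult to see how the neural network recognizes and classifies the input images. Better visualization techniques are needed.

Exercises

These are a few suggestions for exercises that may help improve your skills with TensorFlow. It is important to get hands-on experience with TensorFlow in order to learn how to use it properly.

You may want to backup this Notebook before making any changes.

  • Run 10,000 optimization iterations and see what the classification accuracy is. This will create a checkpoint that saves all the variables of the TensorFlow graph.
  • Run another 100,000 optimization iterations and see if the classification accuracy has improved. Then another 100,000 iterations. Is there any improvement, and do you think it is worth the extra computational time?
  • Try changing the image distortions in the pre-processing.
  • Try changing the structure of the neural network. You can try making the neural network smaller or bigger. How does it affect the training time and the classification accuracy? Note that the checkpoints cannot be reloaded when you change the structure of the neural network.
  • Try using batch-normalization for the 2nd convolutional layer as well. Also try removing it from both layers.
  • Research some of the better neural networks for CIFAR-10 and try to implement them.
  • Explain to a friend how the program works.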