TF Boys (TensorFlow Boys) Training Log (5): The CIFAR10 Model and TensorFlow's Four Cross-Entropy Functions

With the data and the network structure in hand, let's write the CIFAR10 code.

First, handle the input. Create cifar10_input.py under /home/your_name/TensorFlow/cifar10/ and enter the following code:

from __future__ import absolute_import       # absolute imports
from __future__ import division              # true division: / is true division, // is floor division
from __future__ import print_function        # print as a function

import os
import tensorflow as tf

# Define a cifar10_data class: takes a filename queue, outputs labels and images
class cifar10_data(object):

    def __init__(self, filename_queue):        # class initialization
        
        # initialization parameters, per the file format described in the previous post
        self.height = 32
        self.width = 32
        self.depth = 3
        # 1 byte for the label
        self.label_bytes = 1
        # 32*32*3 = 3072 bytes for the image
        self.image_bytes = self.height * self.width * self.depth
        # fixed record length to read: 3072 + 1 = 3073 bytes
        self.record_bytes = self.label_bytes + self.image_bytes
        self.label, self.image = self.read_cifar10(filename_queue)
        
    def read_cifar10(self, filename_queue):

        # read fixed-length records
        reader = tf.FixedLengthRecordReader(record_bytes = self.record_bytes)
        key, value = reader.read(filename_queue)
        record_bytes = tf.decode_raw(value, tf.uint8)
        # tf.slice(record_bytes, begin, size)
        label = tf.cast(tf.slice(record_bytes, [0], [self.label_bytes]), tf.int32)
        # after the label, slice self.image_bytes = 3072 bytes as the image
        image_raw = tf.slice(record_bytes, [self.label_bytes], [self.image_bytes])
        # reshape the image to 3*32*32
        image_raw = tf.reshape(image_raw, [self.depth, self.height, self.width])
        # transpose the image to 32*32*3
        image = tf.transpose(image_raw, (1,2,0))        
        image = tf.cast(image, tf.float32)
        return label, image

        
def inputs(data_dir, batch_size, train = True, name = 'input'):

    # tf.name_scope is recommended: it produces a clean graph visualization.
    with tf.name_scope(name):
        if train: 
            # names of the files to read
            filenames = [os.path.join(data_dir,'data_batch_%d.bin' % ii) 
                        for ii in range(1,6)]
            # raise an error if a file is missing
            for f in filenames:
                if not tf.gfile.Exists(f):
                    raise ValueError('Failed to find file: ' + f)
            # build a filename queue from the file names
            filename_queue = tf.train.string_input_producer(filenames)
            # feed it into the cifar10_data class
            read_input = cifar10_data(filename_queue)
            images = read_input.image
            # per-image whitening; the network is simple, and accuracy is
            # very low without this line.
            images = tf.image.per_image_whitening(images)
            labels = read_input.label
            # build the batch queue with 16 threads and capacity 20192;
            # min_after_dequeue is the minimum number of elements left in the
            # queue after a dequeue, guaranteeing the queue always holds at
            # least that many; a good rule of thumb is
            # capacity = min_after_dequeue + batch_size * 3
            num_preprocess_threads = 16
            image, label = tf.train.shuffle_batch(
                                    [images,labels], batch_size = batch_size, 
                                    num_threads = num_preprocess_threads, 
                                    min_after_dequeue = 20000, capacity = 20192)

            return image, tf.reshape(label, [batch_size])
            
        else:
            filenames = [os.path.join(data_dir,'test_batch.bin')]
            for f in filenames:
                if not tf.gfile.Exists(f):
                    raise ValueError('Failed to find file: ' + f)
                    
            filename_queue = tf.train.string_input_producer(filenames)
            read_input = cifar10_data(filename_queue)
            images = read_input.image
            images = tf.image.per_image_whitening(images)
            labels = read_input.label
            num_preprocess_threads = 16
            image, label = tf.train.shuffle_batch(
                                    [images,labels], batch_size = batch_size, 
                                    num_threads = num_preprocess_threads, 
                                    min_after_dequeue = 20000, capacity = 20192)

            return image, tf.reshape(label, [batch_size])
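To see the pipeline actually produce data, here is a minimal usage sketch (my addition, not part of the original post). The data_dir path is an assumption, so point it at wherever the binaries from the previous post were unpacked; note that nothing flows until the queue runners are started:

import tensorflow as tf
import cifar10_input

# assumed location of the CIFAR-10 binaries -- adjust to your own setup
data_dir = '/home/your_name/TensorFlow/cifar10/data/cifar-10-batches-bin/'
images, labels = cifar10_input.inputs(data_dir, batch_size = 64, train = True)

with tf.Session() as sess:
    coord = tf.train.Coordinator()
    # string_input_producer and shuffle_batch only produce data once the
    # queue runners are started
    threads = tf.train.start_queue_runners(sess = sess, coord = coord)
    img_batch, label_batch = sess.run([images, labels])
    print(img_batch.shape, label_batch.shape)    # (64, 32, 32, 3) (64,)
    coord.request_stop()
    coord.join(threads)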

Next, create cifar10.py under /home/your_name/TensorFlow/cifar10/ and enter the following code:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os
import os.path
import time
from datetime import datetime

import numpy as np
from six.moves import xrange
import tensorflow as tf

import cifar10_input

BATCH_SIZE = 64
LEARNING_RATE = 0.1
MAX_STEP = 50000
TRAIN = True

# Define a constant on the CPU with get_variable
def variable_on_cpu(name, shape, initializer = tf.constant_initializer(0.1)):
    with tf.device('/cpu:0'):
        dtype = tf.float32
        var = tf.get_variable(name, shape, initializer = initializer,
                              dtype = dtype)
    return var

# Define a variable on the CPU with get_variable
def variables(name, shape, stddev):
    dtype = tf.float32
    var = variable_on_cpu(name, shape,
                          tf.truncated_normal_initializer(stddev = stddev,
                                                          dtype = dtype))
    return var

# Define the network structure
def inference(images):
    with tf.variable_scope('conv1') as scope:
        # 5*5 convolution kernels, 64 feature maps
        weights = variables('weights', [5,5,3,64], 5e-2)
        # convolution with 1*1 stride
        conv = tf.nn.conv2d(images, weights, [1,1,1,1], padding = 'SAME')
        biases = variable_on_cpu('biases', [64])
        # add the biases
        bias = tf.nn.bias_add(conv, biases)
        # pass through the ReLU activation
        conv1 = tf.nn.relu(bias, name = scope.name)
        # histogram summary of conv1
        tf.histogram_summary(scope.name + '/activations', conv1)

    with tf.variable_scope('pooling1_lrn') as scope:
        # max pooling, 3*3 window, 2*2 stride
        pool1 = tf.nn.max_pool(conv1, ksize = [1,3,3,1], strides = [1,2,2,1],
                               padding = 'SAME', name = 'pool1')
        # local response normalization
        norm1 = tf.nn.lrn(pool1, 4, bias = 1.0, alpha = 0.001/9.0, beta = 0.75,
                          name = 'norm1')

    with tf.variable_scope('conv2') as scope:
        weights = variables('weights', [5,5,64,64], 5e-2)
        conv = tf.nn.conv2d(norm1, weights, [1,1,1,1], padding = 'SAME')
        biases = variable_on_cpu('biases', [64])
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name = scope.name)
        tf.histogram_summary(scope.name + '/activations', conv2)

    with tf.variable_scope('pooling2_lrn') as scope:
        norm2 = tf.nn.lrn(conv2, 4, bias = 1.0, alpha = 0.001/9.0, beta = 0.75,
                          name = 'norm2')
        pool2 = tf.nn.max_pool(norm2, ksize = [1,3,3,1], strides = [1,2,2,1],
                               padding = 'SAME', name = 'pool2')

    with tf.variable_scope('local3') as scope:
        # first fully connected layer
        reshape = tf.reshape(pool2, [BATCH_SIZE, -1])
        dim = reshape.get_shape()[1].value
        weights = variables('weights', shape = [dim,384], stddev = 0.004)
        biases = variable_on_cpu('biases', [384])
        # ReLU activation
        local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases,
                            name = scope.name)
        # histogram summary of local3
        tf.histogram_summary(scope.name + '/activations', local3)

    with tf.variable_scope('local4') as scope:
        # second fully connected layer
        weights = variables('weights', shape = [384,192], stddev = 0.004)
        biases = variable_on_cpu('biases', [192])
        local4 = tf.nn.relu(tf.matmul(local3, weights) + biases,
                            name = scope.name)
        tf.histogram_summary(scope.name + '/activations', local4)

    with tf.variable_scope('softmax_linear') as scope:
        # the "softmax" layer; not a strict softmax, the real softmax
        # happens in the loss layer
        weights = variables('weights', [192, 10], stddev = 1/192.0)
        biases = variable_on_cpu('biases', [10])
        softmax_linear = tf.add(tf.matmul(local4, weights), biases,
                                name = scope.name)
        tf.histogram_summary(scope.name + '/activations', softmax_linear)
    return softmax_linear

# Cross-entropy loss layer
def losses(logits, labels):
    with tf.variable_scope('loss') as scope:
        labels = tf.cast(labels, tf.int64)
        # cross-entropy loss; why this particular function is used is
        # explained below.
        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits\
                        (logits, labels, name = 'cross_entropy_per_example')
        loss = tf.reduce_mean(cross_entropy, name = 'loss')
        tf.scalar_summary(scope.name + '/x_entropy', loss)
    return loss
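To make the data flow concrete, here is a minimal sketch (mine, not from the original post) of how inputs, inference, and losses plug together; data_dir is the same assumed path as in the earlier sketch, and the plain-SGD train_op is my own assumption since the post has not shown its training step yet:

images, labels = cifar10_input.inputs(data_dir, BATCH_SIZE, train = TRAIN)
logits = inference(images)       # (64, 10) unscaled class scores
loss = losses(logits, labels)    # scalar mean cross-entropy
# assumed optimizer -- the actual training step comes later in the series
train_op = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(loss)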

Now let's look at why we use a function with such a long name, tf.nn.sparse_softmax_cross_entropy_with_logits. The official documentation lists four cross-entropy loss functions in total:

1. tf.nn.sigmoid_cross_entropy_with_logits(logits, targets, name=None)

2. tf.nn.softmax_cross_entropy_with_logits(logits, labels, dim=-1, name=None)

3. tf.nn.sparse_softmax_cross_entropy_with_logits(logits, labels, name=None)

4. tf.nn.weighted_cross_entropy_with_logits(logits, targets, pos_weight, name=None)

Let's look at them one by one:

1) The first function is the classic sigmoid cross-entropy. With x = logits and z = targets, its cross-entropy loss can be written as:

z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))

Note that sigmoid cross-entropy is for binary classification, and logits and targets must have the same shape.
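As a sanity check, here is a small numpy/TensorFlow sketch (my addition) showing the op and the formula above computing the same elementwise losses:

import numpy as np
import tensorflow as tf

x = np.array([2.0, -1.0, 0.5], dtype = np.float32)   # logits
z = np.array([1.0,  0.0, 1.0], dtype = np.float32)   # binary targets

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
manual = z * -np.log(sigmoid(x)) + (1 - z) * -np.log(1 - sigmoid(x))

tf_loss = tf.nn.sigmoid_cross_entropy_with_logits(x, z)
with tf.Session() as sess:
    print(np.allclose(manual, sess.run(tf_loss)))    # True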

2) The second function is softmax cross-entropy, used for multi-class classification where the classes are mutually exclusive: an element cannot belong to two classes at once. It too requires logits and labels to have the same shape.

For example, in the losses code above the targets fall into 10 classes: logits has shape 64*10, while targets (i.e. labels) has shape [64], so this function cannot be used directly. To use it, labels must be converted to a 64*10 one-hot encoding. Suppose the 64 values of labels are [1,5,2,3,0,4,9,8,7,5,6,4,5,8...]; after one-hot encoding, the first row becomes [0,1,0,0,0,0,0,0,0,0], the second row [0,0,0,0,0,1,0,0,0,0], the third row [0,0,1,0,0,0,0,0,0,0], and so on: in each row, the element at index label becomes 1 and all others are 0. In code:

targets = np.zeros([64,10], dtype = np.float)
for index, value in enumerate(labels):
    targets[index, value] = 1.0
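With targets built this way, the dense version can then be called directly (a sketch; logits stands for the 64*10 output of inference, and the cast is needed because the op wants targets in the same dtype as the logits):

labels_onehot = tf.constant(targets, dtype = tf.float32)    # match logits' float32
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits, labels_onehot)
loss = tf.reduce_mean(cross_entropy)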

3) The third is the one we actually use. Unlike the second function, it does not require the two shapes to match, only dimension 0: with logits of shape 64*10 and targets (i.e. labels) of shape [64], dimension 0 matches, so this function can be used with no one-hot encoding needed. This is plain to see in the flowchart we drew in the previous post: the loss layer takes one 64*10-dimensional input and one 64-dimensional input. Moreover, this function computes the softmax internally, which is why the last layer of inference does not compute a true softmax.
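Here is a self-contained toy check (my addition) that the dense and sparse versions agree once the labels are one-hot encoded:

import numpy as np
import tensorflow as tf

logits = np.random.randn(4, 10).astype(np.float32)   # 4 examples, 10 classes
labels = np.array([1, 5, 2, 3])                      # integer class labels

onehot = np.zeros([4, 10], dtype = np.float32)
for index, value in enumerate(labels):
    onehot[index, value] = 1.0

dense  = tf.nn.softmax_cross_entropy_with_logits(logits, onehot)
sparse = tf.nn.sparse_softmax_cross_entropy_with_logits(logits, labels)
with tf.Session() as sess:
    d, s = sess.run([dense, sparse])
    print(np.allclose(d, s))    # True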

4) The fourth is much like the first, except it takes an extra weight pos_weight. With x = logits, z = targets, and q = pos_weight, its cross-entropy loss is:

  q * z * -log(sigmoid(x)) + (1 - z) * -log(1 - sigmoid(x))
= q * z * -log(1 / (1 + exp(-x))) + (1 - z) * -log(exp(-x) / (1 + exp(-x)))
= q * z * log(1 + exp(-x)) + (1 - z) * (-log(exp(-x)) + log(1 + exp(-x)))
= q * z * log(1 + exp(-x)) + (1 - z) * (x + log(1 + exp(-x)))
= (1 - z) * x + (qz +  1 - z) * log(1 + exp(-x))
= (1 - z) * x + (1 + (q - 1) * z) * log(1 + exp(-x))
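A quick numpy check (my addition) that the last line of the derivation matches the direct definition:

import numpy as np

x = np.array([1.5, -0.3])   # logits
z = np.array([1.0,  0.0])   # targets
q = 3.0                     # pos_weight

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))
direct  = q * z * -np.log(sigmoid(x)) + (1 - z) * -np.log(1 - sigmoid(x))
reduced = (1 - z) * x + (1 + (q - 1) * z) * np.log(1 + np.exp(-x))
print(np.allclose(direct, reduced))    # True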

References:

1. https://www.tensorflow.org/api_docs/python/nn/classification

2. https://github.com/tensorflow/models/tree/master/tutorials/image/cifar10
