Face recognition is a biometric technology that identifies a person from facial feature information. A camera captures face images or a video stream; faces are automatically detected and tracked in the image, then processed with face-related techniques such as face detection, facial landmark detection, and face verification. Alipay's "Paying with Your Face" made MIT Technology Review's 2017 list of ten breakthrough technologies.
Face recognition has three advantages: it is non-intrusive (capture is hard to notice, and the subject's face image can be acquired proactively), contactless (the user need not touch the device), and concurrent (multiple faces can be detected, tracked, and recognized at the same time). Before deep learning, face recognition took two steps: extracting high-dimensional hand-crafted features, then reducing dimensionality. Traditional face recognition works on visible-light images. Deep learning plus big data (massive labeled face datasets) is now the mainstream approach: a neural network trained on a large number of sample images learns features by itself during training, with no manual feature engineering, and can reach 99% recognition accuracy.
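As a toy illustration of that traditional two-step pipeline, here is a sketch using raw pixels plus PCA ("eigenfaces") on scikit-learn's LFW subset; this is an assumption for illustration, not a method from the original text:

# Traditional two-step pipeline, approximated: high-dimensional features
# (here just raw pixels), then dimensionality reduction with PCA.
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA

lfw = fetch_lfw_people(min_faces_per_person=70)  # downloads on first use
X = lfw.data                                     # each row: a flattened face image
pca = PCA(n_components=150, whiten=True).fit(X)  # reduce to 150 dimensions
X_reduced = pca.transform(X)                     # features for a downstream classifier
print(X.shape, '->', X_reduced.shape)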
The face recognition pipeline.
Face image acquisition and detection. Acquisition: a camera captures face images, static or dynamic, at different positions and with different expressions; once the user is within range of the capture device, it automatically searches for and photographs the face. Face detection is a form of object detection (object detection): probability statistics over the target object yield the features of the objects to be detected, from which a detection model is built; the model is matched against the input image and the matching regions are output. Face detection is the preprocessing step of face recognition, precisely locating the position and size of faces in the image. Face image patterns are rich in features, including histogram features, color features, template features, structural features, and Haar-like features. Face detection picks out the useful information and uses these features to find faces. Common detection algorithms include template matching models and AdaBoost models; AdaBoost has the best overall speed/accuracy trade-off, slow to train but fast at detection, fast enough for real-time detection on a video stream.
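As an illustration of the Haar-plus-AdaBoost cascade approach, a minimal sketch with OpenCV's bundled cascade (OpenCV is not used elsewhere in this article and is assumed here purely for illustration; file names are placeholders):

# Viola-Jones style face detection: an AdaBoost cascade over Haar-like features.
import cv2

# The cascade XML ships with OpenCV; the exact path may differ per install.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

img = cv2.imread('test.jpg')                  # placeholder input image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # the cascade works on grayscale

# Slides windows at multiple scales; returns (x, y, w, h) face boxes.
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (0, 255, 0), 2)
cv2.imwrite('detected.jpg', img)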
Face image preprocessing. Based on the detection result, the image is processed to serve feature extraction. Because the captured face image is subject to various constraints and random interference, it usually needs preprocessing such as scaling, rotation, stretching, light compensation, grayscale transformation, histogram equalization, normalization, geometric correction, filtering, and sharpening.
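A minimal preprocessing sketch (OpenCV and NumPy are assumed here for illustration; file names are placeholders) covering grayscale conversion, histogram equalization for light compensation, and resizing/normalization:

import cv2
import numpy as np

face = cv2.imread('face.jpg')                  # a cropped face from the detection step
gray = cv2.cvtColor(face, cv2.COLOR_BGR2GRAY)  # grayscale transformation
equalized = cv2.equalizeHist(gray)             # histogram equalization
resized = cv2.resize(equalized, (160, 160))    # fixed input size for the model
normalized = (resized.astype(np.float32) - 127.5) / 128.0  # scale to roughly [-1, 1]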
Face image feature extraction. The face image information is digitized: the image is turned into a string of numbers (a feature vector). For example, from the positions of the left of the eye, the right of the lips, the nose, and the chin, feature components such as the Euclidean distances, curvatures, and angles between landmarks are extracted and concatenated into a long feature vector.
Face image matching and recognition. The extracted feature data are matched against the face templates stored in a database, and identity is judged by similarity against a set threshold: when the similarity exceeds the threshold, the match is output. Verification is a one-to-one (1:1) comparison proving "you are who you claim to be", used for identity checks in finance and information security. Identification is a one-to-many (1:N) match, "finding you among N people"; with a video stream, recognition completes as soon as a person walks into range, used in security and surveillance.
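A minimal sketch of both modes on top of precomputed feature vectors; the 128-dimensional embeddings, the threshold value, and the database dict are assumptions for illustration:

import numpy as np

def euclidean(a, b):
    return np.linalg.norm(a - b)

THRESHOLD = 1.1  # in practice tuned on a validation set

def verify(emb_a, emb_b):
    """1:1 verification -- are these two faces the same person?"""
    return euclidean(emb_a, emb_b) < THRESHOLD

def identify(emb, database):
    """1:N identification -- closest enrolled identity, or None if too far."""
    name, dist = min(((n, euclidean(emb, e)) for n, e in database.items()),
                     key=lambda t: t[1])
    return (name, dist) if dist < THRESHOLD else (None, dist)

# Example usage with random vectors standing in for real embeddings:
db = {'alice': np.random.rand(128), 'bob': np.random.rand(128)}
probe = np.random.rand(128)
print(identify(probe, db))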
Categories of face recognition tasks.
Face detection. Detects and locates faces in an image, returning high-precision face bounding-box coordinates. It is the first step of any face analysis or processing. The "sliding window" approach selects rectangular regions of the image as windows, extracts features from each window to describe that region, judges from the feature description whether the window contains a face, and keeps traversing all the windows that need to be examined.
Facial landmark detection. Locates and returns the coordinates of the key points of the facial features and contour: face outline, eyes, eyebrows, lips, and nose. Face++ provides up to 106 landmarks. A common landmark localization technique is cascaded shape regression (CSR). The face recognition here is based on the DeepID network, a structure similar to a plain convolutional neural network except that the second-to-last layer, the DeepID layer, is connected to both convolutional layer 4 and max-pooling layer 3; because higher convolutional layers have a larger receptive field, the network considers both local and global features. Layers: input 31x39x1; conv1 28x36x20 (kernel 4x4x1); max-pool1 14x18x20 (filter 2x2); conv2 12x16x40 (kernel 3x3x20); max-pool2 6x8x40 (filter 2x2); conv3 4x6x60 (kernel 3x3x40); max-pool3 2x3x60 (filter 2x2); conv4 1x2x80 (kernel 2x2x60); DeepID layer 1x160; fully connected softmax layer. See "Deep Learning Face Representation from Predicting 10,000 Classes", http://mmlab.ie.cuhk.edu.hk/pdf/YiSun_CVPR14.pdf .
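A rough tf.keras sketch of this topology under the sizes listed above (tf.keras is assumed here for illustration, not the book's code; the number of identity classes is a placeholder), with the DeepID layer fully connected to both max-pool layer 3 and conv layer 4:

import tensorflow as tf
from tensorflow.keras import layers, Model

n_classes = 10000  # e.g. 10,000 identities, as in the DeepID paper

inp = layers.Input(shape=(31, 39, 1))
x = layers.Conv2D(20, 4, activation='relu')(inp)        # 28x36x20
x = layers.MaxPooling2D(2)(x)                           # 14x18x20
x = layers.Conv2D(40, 3, activation='relu')(x)          # 12x16x40
x = layers.MaxPooling2D(2)(x)                           # 6x8x40
x = layers.Conv2D(60, 3, activation='relu')(x)          # 4x6x60
pool3 = layers.MaxPooling2D(2)(x)                       # 2x3x60
conv4 = layers.Conv2D(80, 2, activation='relu')(pool3)  # 1x2x80
# DeepID layer: fully connected to both pool3 (global) and conv4 (local)
deepid = layers.Dense(160, activation='relu')(
    layers.Concatenate()([layers.Flatten()(pool3), layers.Flatten()(conv4)]))
out = layers.Dense(n_classes, activation='softmax')(deepid)

model = Model(inp, out)
model.summary()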
Face verification. Analyzes how likely it is that two faces belong to the same person. Given two face images, it produces a confidence score and a corresponding threshold to assess their similarity.
Face attribute detection. Recognizing face attributes and analyzing facial emotion. https://www.betaface.com/wpa/ offers an online face recognition demo: it reports a person's age, whether they have a beard, their emotion (happy, neutral, angry, furious), gender, whether they wear glasses, and skin color.
Applications of face recognition: beautification in MeituPic (美圖秀秀), the dating site Jiayuan (世紀佳緣) checking the facial resemblance of potential partners, "pay by face" in payments, and face-based authentication in security. Face++ and SenseTime both provide face recognition SDKs.
Face detection. https://github.com/davidsandberg/facenet .
The FaceNet paper by Florian Schroff, Dmitry Kalenichenko, and James Philbin, "FaceNet: A Unified Embedding for Face Recognition and Clustering": https://arxiv.org/abs/1503.03832 . Validation on LFW: https://github.com/davidsandberg/facenet/wiki/Validate-on-lfw .
The LFW (Labeled Faces in the Wild) dataset, http://vis-www.cs.umass.edu/lfw/ , is curated by the computer vision lab of the University of Massachusetts Amherst: 13,233 images of 5,749 people, of whom 4,069 have only one image and 1,680 have more than one. Each image is 250x250. The face images are stored in a folder named after each person.
Data preprocessing. Alignment code: https://github.com/davidsandberg/facenet/blob/master/src/align/align_dataset_mtcnn.py . Align the evaluation dataset to the same image size as the dataset used by the pretrained model. Set the environment variable:
export PYTHONPATH=[...]/facenet/src
Alignment command:
for N in {1..4}; do python src/align/align_dataset_mtcnn.py ~/datasets/lfw/raw ~/datasets/lfw/lfw_mtcnnpy_160 --image_size 160 --margin 32 --random_order --gpu_memory_fraction 0.25 & done
Pretrained model 20170216-091149.zip: https://drive.google.com/file/d/0B5MzpY9kBtDVZ2RpVDYwWmxoSUk . Training set: the MS-Celeb-1M dataset, https://www.microsoft.com/en-us/research/project/ms-celeb-1m-challenge-recognizing-one-million-celebrities-real-world/ , Microsoft's face recognition database, built by taking the top one million names from a celebrity list and collecting 100 face images per celebrity with a search engine. The pretrained model's accuracy is 0.993±0.004.
Evaluation command:
python src/validate_on_lfw.py datasets/lfw/lfw_mtcnnpy_160 models
The benchmark comparison uses facenet/data/pairs.txt, the officially provided randomly generated pairs of matched and mismatched person names and image numbers.
Ten-fold cross validation (10-fold cross validation) is the accuracy-testing method: split the dataset into 10 parts, use 9 of them as the training set and 1 as the test set in turn, and take the mean of the 10 results as the estimate of the algorithm's accuracy. Usually several rounds of 10-fold cross validation are run and averaged.
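A minimal sketch of the procedure with scikit-learn (already a dependency of the validation script below); the data and the classifier here are toy placeholders:

import numpy as np
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression

X, y = np.random.rand(100, 8), np.random.randint(0, 2, 100)  # toy data
scores = []
for train_idx, test_idx in KFold(n_splits=10, shuffle=True).split(X):
    clf = LogisticRegression().fit(X[train_idx], y[train_idx])  # train on 9 folds
    scores.append(clf.score(X[test_idx], y[test_idx]))          # test on the 10th
print('accuracy: %.3f +- %.3f' % (np.mean(scores), np.std(scores)))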
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import argparse
import facenet
import lfw
import os
import sys
import math
from sklearn import metrics
from scipy.optimize import brentq
from scipy import interpolate

def main(args):
    with tf.Graph().as_default():
        with tf.Session() as sess:
            # Read the file containing the pairs used for testing
            # 1. Read pairs.txt; entries look like ['Abel_Pacheco', '1', '4']
            pairs = lfw.read_pairs(os.path.expanduser(args.lfw_pairs))
            # Get the paths for the corresponding images
            # and the ground-truth match/mismatch flags
            paths, actual_issame = lfw.get_paths(os.path.expanduser(args.lfw_dir), pairs, args.lfw_file_ext)

            # Load the model
            # 2. Load the pretrained model
            facenet.load_model(args.model)

            # Get input and output tensors
            images_placeholder = tf.get_default_graph().get_tensor_by_name("input:0")
            embeddings = tf.get_default_graph().get_tensor_by_name("embeddings:0")
            phase_train_placeholder = tf.get_default_graph().get_tensor_by_name("phase_train:0")

            #image_size = images_placeholder.get_shape()[1]
            # For some reason this doesn't work for frozen graphs
            image_size = args.image_size
            embedding_size = embeddings.get_shape()[1]

            # Run forward pass to calculate embeddings
            # 3. Forward pass over all LFW images
            print('Running forward pass on LFW images')
            batch_size = args.lfw_batch_size
            nrof_images = len(paths)
            nrof_batches = int(math.ceil(1.0*nrof_images / batch_size))  # total number of batches
            emb_array = np.zeros((nrof_images, embedding_size))
            for i in range(nrof_batches):
                start_index = i*batch_size
                end_index = min((i+1)*batch_size, nrof_images)
                paths_batch = paths[start_index:end_index]
                images = facenet.load_data(paths_batch, False, False, image_size)
                feed_dict = { images_placeholder:images, phase_train_placeholder:False }
                emb_array[start_index:end_index,:] = sess.run(embeddings, feed_dict=feed_dict)

            # 4. Compute accuracy and validation rate with 10-fold cross validation
            tpr, fpr, accuracy, val, val_std, far = lfw.evaluate(emb_array,
                actual_issame, nrof_folds=args.lfw_nrof_folds)

            print('Accuracy: %1.3f+-%1.3f' % (np.mean(accuracy), np.std(accuracy)))
            print('Validation rate: %2.5f+-%2.5f @ FAR=%2.5f' % (val, val_std, far))

            # Area under the ROC curve
            auc = metrics.auc(fpr, tpr)
            print('Area Under Curve (AUC): %1.3f' % auc)
            # Equal error rate (EER)
            eer = brentq(lambda x: 1. - x - interpolate.interp1d(fpr, tpr)(x), 0., 1.)
            print('Equal Error Rate (EER): %1.3f' % eer)

def parse_arguments(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('lfw_dir', type=str,
        help='Path to the data directory containing aligned LFW face patches.')
    parser.add_argument('--lfw_batch_size', type=int,
        help='Number of images to process in a batch in the LFW test set.', default=100)
    parser.add_argument('model', type=str,
        help='Could be either a directory containing the meta_file and ckpt_file or a model protobuf (.pb) file')
    parser.add_argument('--image_size', type=int,
        help='Image size (height, width) in pixels.', default=160)
    parser.add_argument('--lfw_pairs', type=str,
        help='The file containing the pairs to use for validation.', default='data/pairs.txt')
    parser.add_argument('--lfw_file_ext', type=str,
        help='The file extension for the LFW dataset.', default='png', choices=['jpg', 'png'])
    parser.add_argument('--lfw_nrof_folds', type=int,
        help='Number of folds to use for cross validation. Mainly used for testing.', default=10)
    return parser.parse_args(argv)

if __name__ == '__main__':
    main(parse_arguments(sys.argv[1:]))
Gender and age recognition. https://github.com/dpressel/rude-carnie .
The Adience dataset. http://www.openu.ac.il/home/hassner/Adience/data.html#agegender . 26,580 images of 2,284 subjects, with age labels in eight ranges (0-2, 4-6, 8-13, 15-20, 25-32, 38-43, 48-53, 60+) and noise, pose, and lighting variation. The aligned folder holds cropped and aligned data; the faces folder holds the raw data. fold_0_data.txt through fold_4_data.txt carry the labels for all the data; fold_frontal_0_data.txt through fold_frontal_4_data.txt label only roughly frontal faces. Fields: user_id (the subject's Flickr account ID, https://www.flickr.com/ ), original_image (image file name), face_id (person identifier), age, gender, x, y, dx, dy (face bounding box), tilt_ang (tilt angle), fiducial_yaw_angle (yaw angle), fiducial_score (fiducial score).
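For illustration, a sketch of reading one fold file; the field names come from the list above, but the assumption that the file is tab-separated with a header row is mine:

import csv

# Parse fold_0_data.txt, assumed to be a TSV with a header row.
with open('fold_0_data.txt') as f:
    for row in csv.DictReader(f, delimiter='\t'):
        box = (row['x'], row['y'], row['dx'], row['dy'])  # face bounding box
        print(row['user_id'], row['original_image'], row['face_id'],
              row['age'], row['gender'], box)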
Data preprocessing. A script converts the data into TFRecords format: https://github.com/dpressel/rude-carnie/blob/master/preproc.py . The folder https://github.com/GilLevi/AgeGenderDeepLearning/tree/master/Folds already provides the train/test split and labels; the image lists gender_train.txt and gender_val.txt drive the conversion of the Adience data into TFRecords files. Images are processed into 256x256 JPEG-encoded RGB images and written to the output file output_file with tf.python_io.TFRecordWriter.
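A minimal sketch of this TFRecords step using the tf.python_io.TFRecordWriter API (TensorFlow 1.x) mentioned above; the file names, feature keys, and label value are placeholders, not necessarily the exact ones preproc.py uses:

import tensorflow as tf

def _bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

def _int64_feature(value):
    return tf.train.Feature(int64_list=tf.train.Int64List(value=[value]))

output_file = 'train-00000-of-00001'
with tf.python_io.TFRecordWriter(output_file) as writer:
    with tf.gfile.FastGFile('face.jpg', 'rb') as f:  # placeholder image
        image_data = f.read()                        # assumed already a 256x256 JPEG
    example = tf.train.Example(features=tf.train.Features(feature={
        'image/encoded': _bytes_feature(image_data),
        'image/class/label': _int64_feature(1),      # e.g. a gender label
    }))
    writer.write(example.SerializeToString())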
Building the model. The age and gender models follow Gil Levi and Tal Hassner's paper "Age and Gender Classification Using Convolutional Neural Networks", http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.722.9654&rank=1 . Model code: https://github.com/dpressel/rude-carnie/blob/master/model.py , built on tensorflow.contrib.slim.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
import re
from tensorflow.contrib.layers import *
from tensorflow.contrib.slim.python.slim.nets.inception_v3 import inception_v3_base

TOWER_NAME = 'tower'

def select_model(name):
    if name.startswith('inception'):
        print('selected (fine-tuning) inception model')
        return inception_v3
    elif name == 'bn':
        print('selected batch norm model')
        return levi_hassner_bn
    print('selected default model')
    return levi_hassner

def get_checkpoint(checkpoint_path, requested_step=None, basename='checkpoint'):
    if requested_step is not None:
        model_checkpoint_path = '%s/%s-%s' % (checkpoint_path, basename, requested_step)
        if os.path.exists(model_checkpoint_path) is None:
            print('No checkpoint file found at [%s]' % checkpoint_path)
            exit(-1)
        print(model_checkpoint_path)
        return model_checkpoint_path, requested_step
    ckpt = tf.train.get_checkpoint_state(checkpoint_path)
    if ckpt and ckpt.model_checkpoint_path:
        # Restore checkpoint as described in top of this program
        print(ckpt.model_checkpoint_path)
        global_step = ckpt.model_checkpoint_path.split('/')[-1].split('-')[-1]
        return ckpt.model_checkpoint_path, global_step
    else:
        print('No checkpoint file found at [%s]' % checkpoint_path)
        exit(-1)

def _activation_summary(x):
    tensor_name = re.sub('%s_[0-9]*/' % TOWER_NAME, '', x.op.name)
    tf.summary.histogram(tensor_name + '/activations', x)
    tf.summary.scalar(tensor_name + '/sparsity', tf.nn.zero_fraction(x))

def inception_v3(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.00004
    stddev = 0.1
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("InceptionV3", "InceptionV3", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [tf.contrib.slim.conv2d, tf.contrib.slim.fully_connected],
                weights_regularizer=weights_regularizer,
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [tf.contrib.slim.conv2d],
                    weights_initializer=tf.truncated_normal_initializer(stddev=stddev),
                    activation_fn=tf.nn.relu,
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                net, end_points = inception_v3_base(images, scope=scope)
                with tf.variable_scope("logits"):
                    shape = net.get_shape()
                    net = avg_pool2d(net, shape[1:3], padding="VALID", scope="pool")
                    net = tf.nn.dropout(net, pkeep, name='droplast')
                    net = flatten(net, scope="flatten")
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.truncated_normal([2048, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(net, weights), biases, name=scope.name)
        _activation_summary(output)
    return output

def levi_hassner_bn(nlabels, images, pkeep, is_training):
    batch_norm_params = {
        "is_training": is_training,
        "trainable": True,
        # Decay for the moving averages.
        "decay": 0.9997,
        # Epsilon to prevent 0s in variance.
        "epsilon": 0.001,
        # Collection containing the moving mean and moving variance.
        "variables_collections": {
            "beta": None,
            "gamma": None,
            "moving_mean": ["moving_vars"],
            "moving_variance": ["moving_vars"],
        }
    }
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassnerBN", "LeviHassnerBN", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01),
                    normalizer_fn=batch_norm,
                    normalizer_params=batch_norm_params):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID',
                                      biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                conv2 = convolution2d(pool1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                conv3 = convolution2d(pool2, 384, [3, 3], [1, 1], padding='SAME',
                                      biases_initializer=tf.constant_initializer(0.), scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                # can use tf.contrib.layer.flatten
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output

def levi_hassner(nlabels, images, pkeep, is_training):
    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01)):
                conv1 = convolution2d(images, 96, [7, 7], [4, 4], padding='VALID',
                                      biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                norm1 = tf.nn.local_response_normalization(pool1, 5, alpha=0.0001, beta=0.75, name='norm1')
                conv2 = convolution2d(norm1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2')
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                norm2 = tf.nn.local_response_normalization(pool2, 5, alpha=0.0001, beta=0.75, name='norm2')
                conv3 = convolution2d(norm2, 384, [3, 3], [1, 1],
                                      biases_initializer=tf.constant_initializer(0.), padding='SAME', scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')
    with tf.variable_scope('output') as scope:
        weights = tf.Variable(tf.random_normal([512, nlabels], mean=0.0, stddev=0.01), name='weights')
        biases = tf.Variable(tf.constant(0.0, shape=[nlabels], dtype=tf.float32), name='biases')
        output = tf.add(tf.matmul(drop2, weights), biases, name=scope.name)
    return output
Training the model. https://github.com/dpressel/rude-carnie/blob/master/train.py .
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from six.moves import xrange
from datetime import datetime
import time
import os
import numpy as np
import tensorflow as tf
from data import distorted_inputs
from model import select_model
import json
import re

LAMBDA = 0.01
MOM = 0.9
tf.app.flags.DEFINE_string('pre_checkpoint_path', '',
                           """If specified, restore this pretrained model """
                           """before beginning any training.""")
tf.app.flags.DEFINE_string('train_dir', '/home/dpressel/dev/work/AgeGenderDeepLearning/Folds/tf/test_fold_is_0',
                           'Training directory')
tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")
tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            'Number of preprocessing threads')
tf.app.flags.DEFINE_string('optim', 'Momentum', 'Optimizer')
tf.app.flags.DEFINE_integer('image_size', 227, 'Image size')
tf.app.flags.DEFINE_float('eta', 0.01, 'Learning rate')
tf.app.flags.DEFINE_float('pdrop', 0., 'Dropout probability')
tf.app.flags.DEFINE_integer('max_steps', 40000, 'Number of iterations')
tf.app.flags.DEFINE_integer('steps_per_decay', 10000, 'Number of steps before learning rate decay')
tf.app.flags.DEFINE_float('eta_decay_rate', 0.1, 'Learning rate decay')
tf.app.flags.DEFINE_integer('epochs', -1, 'Number of epochs')
tf.app.flags.DEFINE_integer('batch_size', 128, 'Batch size')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint', 'Checkpoint name')
tf.app.flags.DEFINE_string('model_type', 'default', 'Type of convnet')
tf.app.flags.DEFINE_string('pre_model', '',  # './inception_v3.ckpt',
                           'checkpoint file')
FLAGS = tf.app.flags.FLAGS

# Cut the learning rate by decay_rate every at_step steps
# (with the defaults: x0.1 every 10k steps)
def exponential_staircase_decay(at_step=10000, decay_rate=0.1):
    print('decay [%f] every [%d] steps' % (decay_rate, at_step))
    def _decay(lr, global_step):
        return tf.train.exponential_decay(lr, global_step, at_step, decay_rate, staircase=True)
    return _decay

def optimizer(optim, eta, loss_fn, at_step, decay_rate):
    global_step = tf.Variable(0, trainable=False)
    optz = optim
    lr_decay_fn = None
    if optim == 'Adadelta':
        optz = lambda lr: tf.train.AdadeltaOptimizer(lr, 0.95, 1e-6)
    elif optim == 'Momentum':
        optz = lambda lr: tf.train.MomentumOptimizer(lr, MOM)
        lr_decay_fn = exponential_staircase_decay(at_step, decay_rate)
    return tf.contrib.layers.optimize_loss(loss_fn, global_step, eta, optz,
                                           clip_gradients=4., learning_rate_decay_fn=lr_decay_fn)

def loss(logits, labels):
    labels = tf.cast(labels, tf.int32)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    losses = tf.get_collection('losses')
    regularization_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
    total_loss = cross_entropy_mean + LAMBDA * sum(regularization_losses)
    tf.summary.scalar('tl (raw)', total_loss)
    #total_loss = tf.add_n(losses + regularization_losses, name='total_loss')
    loss_averages = tf.train.ExponentialMovingAverage(0.9, name='avg')
    loss_averages_op = loss_averages.apply(losses + [total_loss])
    for l in losses + [total_loss]:
        tf.summary.scalar(l.op.name + ' (raw)', l)
        tf.summary.scalar(l.op.name, loss_averages.average(l))
    with tf.control_dependencies([loss_averages_op]):
        total_loss = tf.identity(total_loss)
    return total_loss

def main(argv=None):
    with tf.Graph().as_default():
        model_fn = select_model(FLAGS.model_type)
        # Open the metadata file (md.json, written during preprocessing)
        # and figure out nlabels and the size of an epoch
        input_file = os.path.join(FLAGS.train_dir, 'md.json')
        print(input_file)
        with open(input_file, 'r') as f:
            md = json.load(f)

        images, labels, _ = distorted_inputs(FLAGS.train_dir, FLAGS.batch_size, FLAGS.image_size, FLAGS.num_preprocess_threads)
        logits = model_fn(md['nlabels'], images, 1-FLAGS.pdrop, True)
        total_loss = loss(logits, labels)

        train_op = optimizer(FLAGS.optim, FLAGS.eta, total_loss, FLAGS.steps_per_decay, FLAGS.eta_decay_rate)
        saver = tf.train.Saver(tf.global_variables())
        summary_op = tf.summary.merge_all()

        sess = tf.Session(config=tf.ConfigProto(
            log_device_placement=FLAGS.log_device_placement))
        tf.global_variables_initializer().run(session=sess)

        # This is total hackland, it only works to fine-tune iv3
        # (i.e. a pretrained Inception V3 checkpoint can be fed in for fine-tuning)
        if FLAGS.pre_model:
            inception_variables = tf.get_collection(
                tf.GraphKeys.VARIABLES, scope="InceptionV3")
            restorer = tf.train.Saver(inception_variables)
            restorer.restore(sess, FLAGS.pre_model)

        if FLAGS.pre_checkpoint_path:
            if tf.gfile.Exists(FLAGS.pre_checkpoint_path) is True:
                print('Trying to restore checkpoint from %s' % FLAGS.pre_checkpoint_path)
                restorer = tf.train.Saver()
                tf.train.latest_checkpoint(FLAGS.pre_checkpoint_path)
                print('%s: Pre-trained model restored from %s' %
                      (datetime.now(), FLAGS.pre_checkpoint_path))

        # Checkpoint files are stored in a run-(pid) subdirectory
        run_dir = '%s/run-%d' % (FLAGS.train_dir, os.getpid())
        checkpoint_path = '%s/%s' % (run_dir, FLAGS.checkpoint)
        if tf.gfile.Exists(run_dir) is False:
            print('Creating %s' % run_dir)
            tf.gfile.MakeDirs(run_dir)

        tf.train.write_graph(sess.graph_def, run_dir, 'model.pb', as_text=True)
        tf.train.start_queue_runners(sess=sess)
        summary_writer = tf.summary.FileWriter(run_dir, sess.graph)
        steps_per_train_epoch = int(md['train_counts'] / FLAGS.batch_size)
        num_steps = FLAGS.max_steps if FLAGS.epochs < 1 else FLAGS.epochs * steps_per_train_epoch
        print('Requested number of steps [%d]' % num_steps)

        for step in xrange(num_steps):
            start_time = time.time()
            _, loss_value = sess.run([train_op, total_loss])
            duration = time.time() - start_time

            assert not np.isnan(loss_value), 'Model diverged with loss = NaN'

            # Log throughput every 10 steps, write summaries every 100,
            # and save a checkpoint every 1000 steps (and at the end)
            if step % 10 == 0:
                num_examples_per_step = FLAGS.batch_size
                examples_per_sec = num_examples_per_step / duration
                sec_per_batch = float(duration)
                format_str = ('%s: step %d, loss = %.3f (%.1f examples/sec; %.3f '
                              'sec/batch)')
                print(format_str % (datetime.now(), step, loss_value,
                                    examples_per_sec, sec_per_batch))

            # Loss only actually evaluated every 100 steps?
            if step % 100 == 0:
                summary_str = sess.run(summary_op)
                summary_writer.add_summary(summary_str, step)

            if step % 1000 == 0 or (step + 1) == num_steps:
                saver.save(sess, checkpoint_path, global_step=step)

if __name__ == '__main__':
    tf.app.run()
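With the flags defined in the script above, a training run might look like the following (the fold directory path is a placeholder for wherever the preprocessed TFRecords live):

python train.py --train_dir /path/to/AgeGenderDeepLearning/Folds/tf/test_fold_is_0 --model_type bn --max_steps 40000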
Validating the model. https://github.com/dpressel/rude-carnie/blob/master/guess.py .
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import math
import time
from data import inputs
import numpy as np
import tensorflow as tf
from model import select_model, get_checkpoint
from utils import *
import os
import json
import csv

RESIZE_FINAL = 227
GENDER_LIST = ['M', 'F']
AGE_LIST = ['(0, 2)', '(4, 6)', '(8, 12)', '(15, 20)', '(25, 32)', '(38, 43)', '(48, 53)', '(60, 100)']
MAX_BATCH_SZ = 128

tf.app.flags.DEFINE_string('model_dir', '', 'Model directory (where training data lives)')
tf.app.flags.DEFINE_string('class_type', 'age', 'Classification type (age|gender)')
tf.app.flags.DEFINE_string('device_id', '/cpu:0', 'What processing unit to execute inference on')
tf.app.flags.DEFINE_string('filename', '', 'File (Image) or File list (Text/No header TSV) to process')
tf.app.flags.DEFINE_string('target', '', 'CSV file containing the filename processed along with best guess and score')
tf.app.flags.DEFINE_string('checkpoint', 'checkpoint', 'Checkpoint basename')
tf.app.flags.DEFINE_string('model_type', 'default', 'Type of convnet')
tf.app.flags.DEFINE_string('requested_step', '', 'Within the model directory, a requested step to restore e.g., 9000')
tf.app.flags.DEFINE_boolean('single_look', False, 'single look at the image or multiple crops')
tf.app.flags.DEFINE_string('face_detection_model', '', 'Do frontal face detection with model specified')
tf.app.flags.DEFINE_string('face_detection_type', 'cascade', 'Face detection model type (yolo_tiny|cascade)')
FLAGS = tf.app.flags.FLAGS

def one_of(fname, types):
    return any([fname.endswith('.' + ty) for ty in types])

def resolve_file(fname):
    if os.path.exists(fname):
        return fname
    for suffix in ('.jpg', '.png', '.JPG', '.PNG', '.jpeg'):
        cand = fname + suffix
        if os.path.exists(cand):
            return cand
    return None

def classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer):
    try:
        num_batches = math.ceil(len(image_files) / MAX_BATCH_SZ)
        pg = ProgressBar(num_batches)
        for j in range(num_batches):
            start_offset = j * MAX_BATCH_SZ
            end_offset = min((j + 1) * MAX_BATCH_SZ, len(image_files))
            batch_image_files = image_files[start_offset:end_offset]
            print(start_offset, end_offset, len(batch_image_files))
            image_batch = make_multi_image_batch(batch_image_files, coder)
            batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
            batch_sz = batch_results.shape[0]
            for i in range(batch_sz):
                output_i = batch_results[i]
                best_i = np.argmax(output_i)
                best_choice = (label_list[best_i], output_i[best_i])
                print('Guess @ 1 %s, prob = %.2f' % best_choice)
                if writer is not None:
                    f = batch_image_files[i]
                    writer.writerow((f, best_choice[0], '%.2f' % best_choice[1]))
            pg.update()
        pg.done()
    except Exception as e:
        print(e)
        print('Failed to run all images')

def classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer):
    try:
        print('Running file %s' % image_file)
        image_batch = make_multi_crop_batch(image_file, coder)
        batch_results = sess.run(softmax_output, feed_dict={images: image_batch.eval()})
        output = batch_results[0]
        batch_sz = batch_results.shape[0]
        for i in range(1, batch_sz):
            output = output + batch_results[i]
        output /= batch_sz
        best = np.argmax(output)  # the most likely class
        best_choice = (label_list[best], output[best])
        print('Guess @ 1 %s, prob = %.2f' % best_choice)
        nlabels = len(label_list)
        if nlabels > 2:
            output[best] = 0
            second_best = np.argmax(output)
            print('Guess @ 2 %s, prob = %.2f' % (label_list[second_best], output[second_best]))
        if writer is not None:
            writer.writerow((image_file, best_choice[0], '%.2f' % best_choice[1]))
    except Exception as e:
        print(e)
        print('Failed to run image %s ' % image_file)

def list_images(srcfile):
    with open(srcfile, 'r') as csvfile:
        delim = ',' if srcfile.endswith('.csv') else '\t'
        reader = csv.reader(csvfile, delimiter=delim)
        if srcfile.endswith('.csv') or srcfile.endswith('.tsv'):
            print('skipping header')
            _ = next(reader)
        return [row[0] for row in reader]

def main(argv=None):  # pylint: disable=unused-argument
    files = []
    if FLAGS.face_detection_model:
        print('Using face detector (%s) %s' % (FLAGS.face_detection_type, FLAGS.face_detection_model))
        face_detect = face_detection_model(FLAGS.face_detection_type, FLAGS.face_detection_model)
        face_files, rectangles = face_detect.run(FLAGS.filename)
        print(face_files)
        files += face_files

    config = tf.ConfigProto(allow_soft_placement=True)
    with tf.Session(config=config) as sess:
        label_list = AGE_LIST if FLAGS.class_type == 'age' else GENDER_LIST
        nlabels = len(label_list)
        print('Executing on %s' % FLAGS.device_id)
        model_fn = select_model(FLAGS.model_type)

        with tf.device(FLAGS.device_id):
            images = tf.placeholder(tf.float32, [None, RESIZE_FINAL, RESIZE_FINAL, 3])
            logits = model_fn(nlabels, images, 1, False)
            init = tf.global_variables_initializer()

            requested_step = FLAGS.requested_step if FLAGS.requested_step else None
            checkpoint_path = '%s' % (FLAGS.model_dir)
            model_checkpoint_path, global_step = get_checkpoint(checkpoint_path, requested_step, FLAGS.checkpoint)
            saver = tf.train.Saver()
            saver.restore(sess, model_checkpoint_path)

            softmax_output = tf.nn.softmax(logits)
            coder = ImageCoder()

            # Support a batch mode if no face detection model
            if len(files) == 0:
                if (os.path.isdir(FLAGS.filename)):
                    for relpath in os.listdir(FLAGS.filename):
                        abspath = os.path.join(FLAGS.filename, relpath)
                        if os.path.isfile(abspath) and any([abspath.endswith('.' + ty) for ty in ('jpg', 'png', 'JPG', 'PNG', 'jpeg')]):
                            print(abspath)
                            files.append(abspath)
                else:
                    files.append(FLAGS.filename)
                    # If it happens to be a list file, read the list and clobber the files
                    if any([FLAGS.filename.endswith('.' + ty) for ty in ('csv', 'tsv', 'txt')]):
                        files = list_images(FLAGS.filename)

            writer = None
            output = None
            if FLAGS.target:
                print('Creating output file %s' % FLAGS.target)
                output = open(FLAGS.target, 'w')
                writer = csv.writer(output)
                writer.writerow(('file', 'label', 'score'))

            image_files = list(filter(lambda x: x is not None, [resolve_file(f) for f in files]))
            print(image_files)
            if FLAGS.single_look:
                classify_many_single_crop(sess, label_list, softmax_output, coder, images, image_files, writer)
            else:
                for image_file in image_files:
                    classify_one_multi_crop(sess, label_list, softmax_output, coder, images, image_file, writer)

            if output is not None:
                output.close()

if __name__ == '__main__':
    tf.app.run()
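With the flags defined above, classifying a single image might look like this (the run-12345 checkpoint directory stands in for the run-(pid) directory written by train.py; the image path is a placeholder):

python guess.py --class_type gender --model_type bn --model_dir /path/to/run-12345 --filename face.jpg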
Microsoft's http://how-old.net/ recognizes age and gender from face photos, and can also search for images by query.
Reference: 《TensorFlow技術解析與實戰》
Recommendations for machine learning job openings in Shanghai are welcome; my WeChat ID: qingxingfengzi