Deep Fun (深度有趣) | 12 Let's Get Hands-On

Date: 2019-12-19


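Introduction

Build a real-time hand detector with TensorFlow

This is similar to using Inception-v3 with transfer learning for a custom image classification task

Building on the previous lesson, we add hand annotation data and fine-tune a pretrained model via transfer learning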

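Data

The hand detection data comes from

vision.soic.indiana.edu/projects/eg…

The images were captured with Google Glass. egohands_data.zip is an archive containing 48 folders, one for each of 48 different scenes (indoors, outdoors, playing chess, and so on), with 4,800 annotated images in total. Each annotation consists of the contour points of every hand in the image

We do not need to extract the archive by hand, though; a script handles extracting and organizing the data

egohands_dataset_clean.py performs the following steps in order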

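If egohands_data.zip is not in the current directory, download it by calling download_egohands_dataset()
Otherwise, extract egohands_data.zip to obtain the egohands folder, and run rename_files() on the image data inside it
rename_files() renames every image by prepending the name of its parent folder, so that image names do not collide, and then calls generate_csv_files()
generate_csv_files() reads the images of each scene and calls get_bbox_visualize(), which draws and displays the hand contours and bounding boxes based on the annotation file polygons.mat, while converting the image annotations and saving them as csv files. Once everything has been processed, it calls split_data_test_eval_train()
split_data_test_eval_train() splits the data into training and test sets, creating train and test folders inside the images folder to hold the corresponding images and csv annotations
After these steps are done, the egohands folder produced by the initial extraction can be deleted manually

In other words, the script turns egohands_data.zip into the images folder, which took about 6 minutes on my laptop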
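
The conversion from contour points to a box can be illustrated with a minimal sketch. It is not the code from egohands_dataset_clean.py: the function names, the sample file name, the 1280x720 image size, and the csv column order (filename, width, height, class, xmin, ymin, xmax, ymax) are all assumptions made for illustration.

```python
def polygon_to_bbox(points):
    """Return (xmin, ymin, xmax, ymax) enclosing a list of (x, y) contour points."""
    xs = [int(round(x)) for x, _ in points]
    ys = [int(round(y)) for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def to_csv_row(filename, width, height, points, label="hand"):
    """Build one csv row for a single hand contour (column order is assumed)."""
    xmin, ymin, xmax, ymax = polygon_to_bbox(points)
    # Clamp the box to the image so later normalization stays within [0, 1]
    xmin, ymin = max(xmin, 0), max(ymin, 0)
    xmax, ymax = min(xmax, width), min(ymax, height)
    return [filename, width, height, label, xmin, ymin, xmax, ymax]


# Hypothetical contour of one hand, in pixel coordinates
contour = [(100.4, 200.0), (150.0, 180.6), (170.2, 260.0), (120.0, 270.9)]
row = to_csv_row("SCENE_A_frame_0001.jpg", 1280, 720, contour)
print(row)
```

Taking the minimum and maximum of the contour coordinates gives the tightest axis-aligned box around the hand, which is the form an SSD-style detector trains on.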

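Next, run generate_tfrecord.py to convert the training and test sets into TFRecord files

Because only hands need to be detected here, there is a single object class, hand. To customize the pipeline for another object detection task, modify the following code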

def class_text_to_int(row_label):
    # Map a class name to its integer id; unknown labels map to None
    if row_label == 'hand':
        return 1
    else:
        return None
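
For a task with several classes, one way to extend this function is a dictionary lookup. The sketch below is only an illustration: the class "face" and the name LABEL_TO_ID are hypothetical and do not appear in the original script. Ids start at 1 because id 0 is reserved for the background class in the Object Detection API.

```python
# Hypothetical label set; ids must match the ids in the label map (.pbtxt) file
LABEL_TO_ID = {
    "hand": 1,
    "face": 2,
}


def class_text_to_int(row_label):
    """Return the integer id of a class name, or None for an unknown label."""
    return LABEL_TO_ID.get(row_label)


print(class_text_to_int("hand"))
```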

Run the following two commands to generate the TFRecord files for the training and test sets

python generate_tfrecord.py --csv_input=images/train/train_labels.csv  --output_path=retrain/train.record

python generate_tfrecord.py --csv_input=images/test/test_labels.csv  --output_path=retrain/test.record
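
generate_tfrecord.py itself is not listed here. Scripts of this kind typically group the csv rows by image and scale each box to the [0, 1] range before writing a record. The sketch below shows only that preprocessing step in plain Python; the column names and sample rows are assumptions, and it does not write a TFRecord.

```python
import csv
import io
from collections import defaultdict

# Hypothetical annotation rows; the column names are assumed
SAMPLE_CSV = """filename,width,height,class,xmin,ymin,xmax,ymax
a.jpg,1280,720,hand,128,72,640,360
a.jpg,1280,720,hand,640,360,1280,720
b.jpg,1280,720,hand,0,0,320,180
"""


def group_and_normalize(csv_text):
    """Group boxes by image file and scale pixel coordinates to [0, 1]."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        width, height = float(row["width"]), float(row["height"])
        grouped[row["filename"]].append({
            "label": row["class"],
            "xmin": int(row["xmin"]) / width,
            "ymin": int(row["ymin"]) / height,
            "xmax": int(row["xmax"]) / width,
            "ymax": int(row["ymax"]) / height,
        })
    return dict(grouped)


boxes_by_image = group_and_normalize(SAMPLE_CSV)
for name, boxes in sorted(boxes_by_image.items()):
    print(name, len(boxes), "box(es)")
```

Grouping matters because one image may contain several hands, and all of its boxes belong in a single training example.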

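Model

We again use ssd_mobilenet_v1_coco from the previous lesson, but since only hands need to be detected here, the model has to be fine-tuned via transfer learning on the custom annotation data

The retrain folder contains the following

train.record and test.record: annotation data for the custom object detection task
ssd_mobilenet_v1_coco_11_06_2017: the pretrained ssd_mobilenet_v1_coco model
ssd_mobilenet_v1_coco.config: the configuration file for training the model with transfer learning
hand_label_map.pbtxt: the mapping between detection class names and ids
retrain.py: the transfer learning training code
object_detection: supporting files

A template for the configuration file ssd_mobilenet_v1_coco.config is available here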

github.com/tensorflow/…

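Edit the configuration file as needed, mainly the entries below, including those containing PATH_TO_BE_CONFIGURED

num_classes: the number of object classes, which is 1 here
fine_tune_checkpoint: the checkpoint file of the pretrained model
train_input_reader: sets the training data input_path and the mapping file path label_map_path
eval_input_reader: sets the test data input_path and the mapping file path label_map_path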
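
These edits can also be scripted. The sketch below fills in a heavily abridged, illustrative stand-in for the template: the real file has many more fields, and the placeholder file names in TEMPLATE as well as the relative paths passed to fill_config are assumptions, so check them against the actual template before reusing this.

```python
import re

# Abridged, illustrative stand-in for the real template (assumed layout)
TEMPLATE = '''
model {
  ssd {
    num_classes: 90
  }
}
train_config {
  fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
}
train_input_reader {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/mscoco_train.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
eval_input_reader {
  tf_record_input_reader {
    input_path: "PATH_TO_BE_CONFIGURED/mscoco_val.record"
  }
  label_map_path: "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt"
}
'''


def fill_config(template, num_classes, checkpoint, train_record, test_record, label_map):
    """Replace the class count and every placeholder path in the template text."""
    text = re.sub(r"num_classes: \d+", "num_classes: %d" % num_classes, template)
    replacements = {
        "PATH_TO_BE_CONFIGURED/model.ckpt": checkpoint,
        "PATH_TO_BE_CONFIGURED/mscoco_train.record": train_record,
        "PATH_TO_BE_CONFIGURED/mscoco_val.record": test_record,
        "PATH_TO_BE_CONFIGURED/mscoco_label_map.pbtxt": label_map,
    }
    for placeholder, value in replacements.items():
        text = text.replace(placeholder, value)
    return text


# Paths are assumed to be relative to the retrain folder
filled = fill_config(
    TEMPLATE,
    num_classes=1,
    checkpoint="ssd_mobilenet_v1_coco_11_06_2017/model.ckpt",
    train_record="train.record",
    test_record="test.record",
    label_map="hand_label_map.pbtxt",
)
print(filled)
```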

The mapping file hand_label_map.pbtxt looks like this, with a single class

item {
    id: 1
    name: 'hand'
}
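
With more classes, writing this file by hand becomes error-prone. A minimal sketch that generates the same text from a list of class names follows; build_label_map is a hypothetical helper, not part of the Object Detection API.

```python
def build_label_map(class_names):
    """Render label map text with ids starting at 1, in the order given."""
    items = []
    for idx, name in enumerate(class_names, start=1):
        items.append("item {\n    id: %d\n    name: '%s'\n}\n" % (idx, name))
    return "\n".join(items)


# A single class reproduces the mapping file used in this lesson
print(build_label_map(["hand"]))
```

The ids produced here must agree with the values returned by class_text_to_int() when the TFRecord files are generated.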

Start transfer training with the following command, where train_dir is the model output path and pipeline_config_path is the path to the configuration file

python retrain.py --logtostderr --train_dir=output/ --pipeline_config_path=ssd_mobilenet_v1_coco.config

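Once transfer training finishes, the generated model files (.data, .index, .meta, and so on) appear in the output folder

Use TensorBoard to inspect the training process, including the model's total loss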

tensorboard --logdir='output'

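Finally, use export_inference_graph.py to package the model into a .pb file

--pipeline_config_path: path to the configuration file
--trained_checkpoint_prefix: path prefix of the model checkpoint
--output_directory: output path for the .pb file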

python export_inference_graph.py --input_type image_tensor --pipeline_config_path retrain/ssd_mobilenet_v1_coco.config  --trained_checkpoint_prefix retrain/output/model.ckpt-153192 --output_directory hand_detection_inference_graph

This produces the hand_detection_inference_graph folder, which contains a frozen_inference_graph.pb file
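
The checkpoint prefix passed to --trained_checkpoint_prefix ends in a training step number, which differs from run to run. As a minimal sketch, assuming the checkpoints follow the model.ckpt-<step>.index naming seen in the output folder, the newest prefix can be found like this. The helper name and every step number other than 153192 are made up for illustration.

```python
import re


def latest_checkpoint_prefix(filenames):
    """Return 'model.ckpt-<step>' for the highest step found, or None."""
    steps = []
    for name in filenames:
        match = re.fullmatch(r"model\.ckpt-(\d+)\.index", name)
        if match:
            steps.append(int(match.group(1)))
    if not steps:
        return None
    return "model.ckpt-%d" % max(steps)


# Hypothetical directory listing of the output folder
files = [
    "checkpoint",
    "model.ckpt-0.index",
    "model.ckpt-98000.index",
    "model.ckpt-153192.index",
    "model.ckpt-153192.meta",
]
latest = latest_checkpoint_prefix(files)
print(latest)
```

In practice the list would come from something like os.listdir('retrain/output'). Steps are compared as integers because comparing them as strings would rank 98000 above 153192.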

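Application

The trained hand detection model can now be used to build a real-time hand detector

The main change is to the following three lines of code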

PATH_TO_CKPT = 'hand_detection_inference_graph/frozen_inference_graph.pb'
PATH_TO_LABELS = 'retrain/hand_label_map.pbtxt'
NUM_CLASSES = 1

The complete code is as follows

# -*- coding: utf-8 -*-

import cv2
import numpy as np
import tensorflow as tf  # uses the TensorFlow 1.x API (tf.GraphDef, tf.gfile, tf.Session)

from utils import label_map_util
from utils import visualization_utils as vis_util

PATH_TO_CKPT = 'hand_detection_inference_graph/frozen_inference_graph.pb'
PATH_TO_LABELS = 'retrain/hand_label_map.pbtxt'
NUM_CLASSES = 1

# Open the default camera
cap = cv2.VideoCapture(0)

# Load the frozen inference graph
detection_graph = tf.Graph()
with detection_graph.as_default():
    od_graph_def = tf.GraphDef()
    with tf.gfile.GFile(PATH_TO_CKPT, 'rb') as fid:
        od_graph_def.ParseFromString(fid.read())
        tf.import_graph_def(od_graph_def, name='')

# Map class ids to display names
label_map = label_map_util.load_labelmap(PATH_TO_LABELS)
categories = label_map_util.convert_label_map_to_categories(label_map, max_num_classes=NUM_CLASSES, use_display_name=True)
category_index = label_map_util.create_category_index(categories)

with detection_graph.as_default():
    with tf.Session(graph=detection_graph) as sess:
        # Input and output tensors of the detection graph
        image_tensor = detection_graph.get_tensor_by_name('image_tensor:0')
        detection_boxes = detection_graph.get_tensor_by_name('detection_boxes:0')
        detection_scores = detection_graph.get_tensor_by_name('detection_scores:0')
        detection_classes = detection_graph.get_tensor_by_name('detection_classes:0')
        num_detections = detection_graph.get_tensor_by_name('num_detections:0')
        while True:
            ret, image_np = cap.read()
            if not ret:
                # Stop if no frame could be read from the camera
                break
            # OpenCV delivers BGR frames; convert to RGB before detection
            image_np = cv2.cvtColor(image_np, cv2.COLOR_BGR2RGB)
            # Add a batch dimension: [1, height, width, 3]
            image_np_expanded = np.expand_dims(image_np, axis=0)
            (boxes, scores, classes, num) = sess.run(
                [detection_boxes, detection_scores, detection_classes, num_detections],
                feed_dict={image_tensor: image_np_expanded})

            # Draw the detected boxes and labels onto image_np in place
            vis_util.visualize_boxes_and_labels_on_image_array(
                image_np, np.squeeze(boxes), np.squeeze(classes).astype(np.int32),
                np.squeeze(scores), category_index,
                use_normalized_coordinates=True, line_thickness=8)

            cv2.imshow('hand detection', cv2.cvtColor(image_np, cv2.COLOR_RGB2BGR))
            # Press q to quit
            if cv2.waitKey(25) & 0xFF == ord('q'):
                break

# Release the camera and close the window on every exit path
cap.release()
cv2.destroyAllWindows()

After running the code, the hand detection results appear on the camera feed
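
To use the detections for something other than drawing, such as cropping the hand region, the normalized boxes need to be converted to pixel coordinates. Below is a minimal sketch in plain Python: the helper name, the 0.5 score threshold, and the sample values are assumptions, while the [ymin, xmin, ymax, xmax] box order is the one the detection graph outputs.

```python
def boxes_to_pixels(boxes, scores, width, height, min_score=0.5):
    """Keep confident detections and convert normalized boxes to pixel coordinates.

    Each box is [ymin, xmin, ymax, xmax] with values in [0, 1].
    Returns a list of (xmin, ymin, xmax, ymax, score) tuples.
    """
    results = []
    for (ymin, xmin, ymax, xmax), score in zip(boxes, scores):
        if score < min_score:
            continue
        results.append((
            int(xmin * width), int(ymin * height),
            int(xmax * width), int(ymax * height),
            score,
        ))
    return results


# Hypothetical outputs for one 640x480 frame, after np.squeeze(...).tolist()
sample_boxes = [[0.25, 0.5, 0.75, 1.0], [0.0, 0.0, 0.1, 0.1]]
sample_scores = [0.9, 0.2]
detections = boxes_to_pixels(sample_boxes, sample_scores, width=640, height=480)
print(detections)
```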

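Custom Detection Tasks

To customize your own detection task, prepare some images and annotate them by hand; a few hundred annotated images are about enough

Use labelImg to annotate the images; see the following link for installation instructions

github.com/tzutalin/la…

Enter the labelImg folder and run the following command, whose two arguments are the image directory and the path to the class file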

python labelImg.py ../imgs/ ../predefined_classes.txt

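In the annotation window, press w to start drawing a rectangle and press Ctrl+S to save the annotation to the xml folder

Then run xml_to_csv.py to convert the .xml files into a .csv file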
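
xml_to_csv.py is not listed here. As a rough sketch of what such a conversion involves, the code below reads one Pascal VOC style annotation, the XML layout labelImg writes, and returns one row per box. The sample XML, the file name, and the row layout are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one image, in Pascal VOC layout
SAMPLE_XML = """<annotation>
  <filename>img_001.jpg</filename>
  <size><width>640</width><height>480</height><depth>3</depth></size>
  <object>
    <name>hand</name>
    <bndbox><xmin>100</xmin><ymin>120</ymin><xmax>200</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>"""


def xml_to_rows(xml_text):
    """Return one (filename, width, height, class, xmin, ymin, xmax, ymax) row per object."""
    root = ET.fromstring(xml_text)
    filename = root.findtext("filename")
    width = int(root.findtext("size/width"))
    height = int(root.findtext("size/height"))
    rows = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        rows.append((
            filename, width, height, obj.findtext("name"),
            int(box.findtext("xmin")), int(box.findtext("ymin")),
            int(box.findtext("xmax")), int(box.findtext("ymax")),
        ))
    return rows


rows = xml_to_rows(SAMPLE_XML)
print(rows)
```

Collecting the rows from every .xml file in a folder and writing them with csv.writer yields the single .csv file per split that the next step expects.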

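In short, to prepare the TFRecord data, follow these steps

Create train and test folders and distribute the images between them
Annotate the training and test images by hand
Convert the .xml files of each set into a single .csv file
Generate the corresponding TFRecord from the original images and the .csv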
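
The first step, distributing images between train and test, can be sketched as a reproducible random split. The 10% test ratio, the fixed seed, and the file names are assumptions; the source does not state which ratio was used.

```python
import random


def split_train_test(filenames, test_ratio=0.1, seed=0):
    """Shuffle file names reproducibly and return (train, test) lists."""
    names = sorted(filenames)
    random.Random(seed).shuffle(names)
    # Keep at least one image in the test set
    n_test = max(1, int(len(names) * test_ratio))
    return names[n_test:], names[:n_test]


# Hypothetical image names
images = ["img_%03d.jpg" % i for i in range(200)]
train, test = split_train_test(images)
print(len(train), len(test))
```

Sorting before shuffling with a fixed seed makes the split independent of the order in which the operating system lists the files.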

References

How to Build a Real-time Hand-Detector using Neural Networks (SSD) on Tensorflow: towardsdatascience.com/how-to-buil…
EgoHands - A Dataset for Hands in Complex Egocentric Interactions: vision.soic.indiana.edu/projects/eg…
How to train your own Object Detector with TensorFlow’s Object Detector API: towardsdatascience.com/how-to-trai…
