Training your own object detector with dlib: hand detection

Date: 2019-11-18


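Previously we installed dlib on Linux (http://www.cnblogs.com/take-fetter/p/8318602.html) and got the earlier face detection program working.

Today we will learn how to use dlib to build a simple object detector of our own, using hand detection as the example. Special thanks to https://handmap.github.io/dlib-classifier-for-object-detection/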

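Introducing and installing imglab

imglab is a tool that ships with dlib, in the tools directory of the dlib project on GitHub. It is a simple graphical tool for annotating images with object bounding boxes and, optionally, part locations. Typically you use it when you need to train an object detector (for example a face detector), because it lets you easily create the required training dataset.

(The source is at https://github.com/davisking/dlib/tree/master/tools/imglab. If you want to use it, it is best to download the whole dlib project and install dlib first, then build this tool.)

To build it, run the following commands in order: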

    cd dlib/tools/imglab
    mkdir build
    cd build
    cmake ..
    cmake --build . --config Release

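I do not recommend the sudo make install command from readme.txt: after I ran it, imglab could no longer display images.

Training your own hand detector

Hand image datasets are available from Hand Images Databases (https://www.mutah.edu.jo/biometrix/hand-images-databases.html) and from http://www.robots.ox.ac.uk/~vgg/data/hands/

Do the following in the build directory created by cmake (on Windows, the executable is in the Release directory).

Run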

./imglab -c mydataset.xml path/to/image/directory

This creates mydataset.xml together with the stylesheet image_metadata_stylesheet.xsl.

Then run

./imglab mydataset.xml

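A window opens. Here you need to mark the object's position in every image: type the label into the Next Label field, then draw a box on each image (hold Shift and drag with the left mouse button).

Once every image is annotated, choose Save from the File menu and close the window. Opening mydataset.xml now shows that it contains the image information, as in the figure.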
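The saved file can also be inspected from Python. The following is a minimal sketch, not part of the original tutorial. It assumes the usual imglab layout: an images element holding image elements, each with a file attribute and box children carrying top, left, width and height attributes. The function names are made up for illustration, and the sketch assumes relative image paths are resolved against the directory of the XML file.

```python
import os
import xml.etree.ElementTree as ET


def summarize_dataset(xml_path):
    """Return one (file, exists, boxes) tuple per image element.

    boxes is a list of (left, top, width, height) integer tuples.
    """
    base_dir = os.path.dirname(os.path.abspath(xml_path))
    root = ET.parse(xml_path).getroot()
    summary = []
    for image in root.iter("image"):
        file_name = image.get("file")
        # Assumption: relative paths are resolved against the XML's directory.
        full_path = os.path.join(base_dir, file_name)
        boxes = [
            tuple(int(box.get(key)) for key in ("left", "top", "width", "height"))
            for box in image.iter("box")
        ]
        summary.append((file_name, os.path.exists(full_path), boxes))
    return summary


def print_summary(xml_path):
    """Print one line per image: file name, path status, and box count."""
    for file_name, exists, boxes in summarize_dataset(xml_path):
        status = "ok" if exists else "MISSING"
        print("{} [{}]: {} box(es)".format(file_name, status, len(boxes)))
```

Calling print_summary("mydataset.xml") lists every annotated image, so entries marked MISSING or with zero boxes can be fixed before training.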

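Next, put mydataset.xml and image_metadata_stylesheet.xsl in the image directory and run the code below to train. If training reports a wrong image path, check the image locations recorded in mydataset.xml.

The code is adapted from dlib's python_examples. If you want to try it yourself, read the code on GitHub carefully first (https://github.com/davisking/dlib/blob/master/python_examples/train_object_detector.py).

The program requires scikit-image, which can be installed with pip install scikit-image.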

import os
import sys
import glob

import dlib
from skimage import io


# This example trains a hand detector from the images annotated with imglab.
# Pass the directory that holds those images and the XML file as the only
# command line argument so the script knows where to find them.
if len(sys.argv) != 2:
    print(
        "Give the path to the directory containing the annotated images and "
        "the imglab XML file as the argument to this program, for example:\n"
        "    python train_object_detector.py path/to/images")
    sys.exit(1)
# The name faces_folder is kept from the dlib example; it is the image directory.
faces_folder = sys.argv[1]


# Now let's do the training.  The train_simple_object_detector() function has a
# bunch of options, all of which come with reasonable default values.  The next
# few lines go over some of these options.
options = dlib.simple_object_detector_training_options()
# The dlib example enables flips because faces are left/right symmetric.  For
# hands the flag is kept on the assumption that mirrored copies are still valid
# examples (a mirrored left hand resembles a right hand).
options.add_left_right_image_flips = True
# The trainer is a kind of support vector machine and therefore has the usual
# SVM C parameter.  In general, a bigger C encourages it to fit the training
# data better but might lead to overfitting.  You must find the best C value
# empirically by checking how well the trained detector works on a test set of
# images you haven't trained on.  Don't just leave the value set at 5.  Try a
# few different C values and see what works best for your data.
options.C = 5
# Tell the code how many CPU cores your computer has for the fastest training.
options.num_threads = 4
options.be_verbose = True


# NOTE: the XML file name must match the file saved from imglab.  The steps
# above saved mydataset.xml, so either rename that file or change the name here.
training_xml_path = os.path.join(faces_folder, "palm-landmarks.xml")
# testing_xml_path is kept from the dlib example but is not used below.
testing_xml_path = os.path.join(faces_folder, "testing.xml")
# This function does the actual training.  It will save the final detector to
# detector.svm.  The input is an XML file that lists the images in the training
# dataset and also contains the positions of the hand boxes.  Here it is the
# XML file created with the imglab tool as described above.
dlib.train_simple_object_detector(training_xml_path, "detector.svm", options)
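
The comment on options.C in the script above says the value has to be found empirically, on images that were not used for training. A minimal sketch of that search follows. It is not from the original post: the helper names are hypothetical, it assumes a second imglab file (called testing.xml here) annotated from held-out images, and it assumes the object returned by dlib.test_simple_object_detector exposes an average_precision attribute, as recent dlib builds do.

```python
def train_and_score(c_value, training_xml, testing_xml, detector_path="detector.svm"):
    """Train with one C value and return average precision on held-out images."""
    import dlib  # imported here so pick_best_c below can be used without dlib

    options = dlib.simple_object_detector_training_options()
    options.add_left_right_image_flips = True
    options.C = c_value
    options.num_threads = 4
    dlib.train_simple_object_detector(training_xml, detector_path, options)
    # Assumption: the result object has an average_precision attribute.
    result = dlib.test_simple_object_detector(testing_xml, detector_path)
    return result.average_precision


def pick_best_c(candidates, score_fn):
    """Score every candidate C and return (best_c, scores_by_c)."""
    scores = {c: score_fn(c) for c in candidates}
    best_c = max(scores, key=scores.get)
    return best_c, scores
```

One way to use it is pick_best_c([1, 5, 10, 50], lambda c: train_and_score(c, "mydataset.xml", "testing.xml")). Every candidate triggers a full training run and overwrites detector.svm, so train once more with the chosen value afterwards.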

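Now wait for training to finish. Keep the dataset small: one that is too large can exhaust memory, and the OS will then kill the process. Many of the parameters in options also need to be tuned for your own data.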
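If the full dataset does not fit in memory, one option is to train on a random subset of the annotated images. The sketch below is an illustration only, not something the original tutorial does. It assumes the imglab layout with an images element that holds the image entries, and the function name and the output file name small.xml are hypothetical. Writing the smaller file next to the original keeps relative image paths valid.

```python
import random
import xml.etree.ElementTree as ET


def subsample_dataset(src_xml, dst_xml, max_images, seed=0):
    """Write a copy of src_xml that keeps at most max_images image entries.

    Returns the number of image entries written.
    """
    tree = ET.parse(src_xml)
    images_node = tree.getroot().find("images")
    images = images_node.findall("image")
    if len(images) > max_images:
        # A fixed seed makes the chosen subset reproducible.
        keep = set(random.Random(seed).sample(range(len(images)), max_images))
        for index, image in enumerate(images):
            if index not in keep:
                images_node.remove(image)
    # The stylesheet reference is dropped on write; it is only needed for
    # viewing the file in a browser.
    tree.write(dst_xml, encoding="utf-8", xml_declaration=True)
    return len(images_node.findall("image"))
```

For example, subsample_dataset("mydataset.xml", "small.xml", 200) writes a file with at most 200 images that can be passed to the training script in place of the full one.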

When training finishes it writes detector.svm. The following program is a simple test of the detector:

import imutils
import dlib
import cv2
import time

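# detector_from_author.svm is the file name used in this post; change it to
# detector.svm to test the detector produced by the training script.
detector = dlib.simple_object_detector("detector_from_author.svm")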

image = cv2.imread('test0.jpg')
image = imutils.resize(image, width=500)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)

rects = detector(gray, 1)
#win_det = dlib.image_window()
#win_det.set_image(detector)

#win = dlib.image_window()

for (k, d) in enumerate(rects):
    print("Detection {}: Left: {} Top: {} Right: {} Bottom: {}".format(
        k, d.left(), d.top(), d.right(), d.bottom()))
    cv2.rectangle(image, (d.left(), d.top()), (d.right(), d.bottom()), (0, 255, 0), 2)

#win.add_overlay(rects)
cv2.imshow("Output", image)
cv2.waitKey(0)

Result

As you can see, the hand is detected.
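The test script runs the detector on a copy resized to a width of 500 pixels, so the printed coordinates refer to the resized image. If the boxes are needed on the full-size picture, they can be scaled back. This is a small sketch under the assumption that the resize preserved the aspect ratio, which is what imutils.resize does when only width is given. The helper name scale_box is hypothetical.

```python
def scale_box(box, resized_width, original_width):
    """Map (left, top, right, bottom) from the resized image to the original."""
    # One factor serves both axes because the aspect ratio is unchanged.
    scale = original_width / float(resized_width)
    return tuple(int(round(value * scale)) for value in box)
```

With the original image kept in a separate variable before resizing, a detection d maps back with scale_box((d.left(), d.top(), d.right(), d.bottom()), 500, original.shape[1]).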

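Postscript:

Training takes a long time, so be patient.
Special thanks again to Nathan Glover and his tutorial https://handmap.github.io/dlib-classifier-for-object-detection/
If you need a highly accurate detector, I do not recommend this approach: the svm file we end up with falls far short of the face detector provided by dlib's author.
In my view imglab offers few features and is not suited to large-scale tasks that demand high accuracy (although it is still quite good for face recognition).
For object detection that needs high precision and accuracy, the Tensorflow Object Detection API is probably a better fit (https://github.com/tensorflow/models/tree/master/research/object_detection)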
