I am being a bit lazy today and have not gotten around to compiling the latest papers, so please bear with me. Instead, here is a small demo I put together a while ago; forgive the rough implementation.
This post uses OpenCV 3.3.1 or later (e.g., OpenCV 3.4), the DNN module, and the face_detector sample to implement simple, real-time face detection.
Previous posts
[Computer Vision] Beginner learning resources
[Computer Vision Paper Digest] 2018-03-20
[Computer Vision Paper Digest] 2018-03-18
Notes:
[1]: This post is mainly based on the English tutorial Face detection with OpenCV and deep learning, with some modifications.
[2]: I have verified that OpenCV 3.3.0 and earlier do not include the face_detector sample and do not support it. To avoid unnecessary trouble, use OpenCV 3.3.1 or later (e.g., OpenCV 3.4).
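Note [2] boils down to a version requirement. A quick way to confirm that your Python OpenCV build is new enough is a version check like the sketch below (it only assumes the cv2 package is installed):

import cv2

print("OpenCV version:", cv2.__version__)

# Parse the leading numeric components, e.g. "3.4.1" -> (3, 4, 1).
version = tuple(int(p) for p in cv2.__version__.split(".")[:3] if p.isdigit())
if version < (3, 3, 1):
    raise RuntimeError("OpenCV 3.3.1 or newer is required for the DNN face_detector sample")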
1 Introduction to face_detector
Link to the face_detector sample:
https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector
Once OpenCV 3.3.1 or later is set up on your machine, the face_detector sample folder can also be found under opencv\samples\dnn, as shown in the figure below:
To use a Caffe model with OpenCV's DNN module, you must have both a .prototxt file and a .caffemodel file. The face_detector folder, however, contains only .prototxt files; the trained .caffemodel is missing. The two file types play the following roles (a minimal Python loading sketch follows this list):
The .prototxt file(s) which define the model architecture (i.e., the layers themselves)
The .caffemodel file which contains the weights for the actual layers
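In other words, the network cannot be constructed without the architecture file and is useless without the weights. A minimal Python loading sketch (the paths assume the two files sit in a local face_detector folder, as in the C++ examples later in this post):

import cv2

prototxt = "face_detector/deploy.prototxt"                             # model architecture
caffemodel = "face_detector/res10_300x300_ssd_iter_140000.caffemodel"  # trained weights

net = cv2.dnn.readNetFromCaffe(prototxt, caffemodel)
if net.empty():
    raise RuntimeError("Failed to load the Caffe model; check both file paths")
print("Loaded network with", len(net.getLayerNames()), "layers")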
Contents of the face_detector folder:
deploy.prototxt: the deployment network definition used together with the .caffemodel
how_to_train_face_detector.txt: instructions for training the network on a custom dataset
solver.prototxt: solver (hyperparameter) file
test.prototxt: test network definition
train.prototxt: training network definition
This tutorial performs face detection with the pre-trained .caffemodel directly, so only the .caffemodel and deploy.prototxt files are needed. If you want to train the network on your own dataset, see "how_to_train_face_detector.txt".
2 A brief look at ResNet-10 and SSD
This is a hands-on tutorial, so the algorithms themselves are not covered in depth. If you are interested in ResNet and SSD, the following links are good places to learn more:
[1]ResNet paper:https://arxiv.org/abs/1512.03385
[2]ResNet in Caffe:https://github.com/soeaver/caffe-model/tree/master/cls/resnet
[3]SSD paper:https://arxiv.org/abs/1512.02325
[4]SSD in Caffe:https://github.com/weiliu89/caffe/tree/ssd
3 Downloading the .caffemodel
Download link for res10_300x300_ssd_iter_140000.caffemodel: https://anonfile.com/W7rdG4d0b1/face_detector.rar
4 C++ code
4.1 Face detection in an image
For OpenCV 3.4, you can directly use resnet_ssd_face.cpp from the opencv-3.4.1\samples\dnn folder;
for OpenCV 3.3.1, you can use the following code:
face_detector_image.cpp
// Summary: Face detection on an image using the face_detector model with OpenCV 3.3.1
// Author: Amusi
// Date: 2018-02-28

#include <iostream>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>

using namespace std;
using namespace cv;
using namespace cv::dnn;

// Set the size of image and meanval
const size_t inWidth = 300;
const size_t inHeight = 300;
const double inScaleFactor = 1.0;
const Scalar meanVal(104.0, 177.0, 123.0);

int main(int argc, char** argv)
{
    // Load image
    Mat img;
    // Use commandline
#if 0
    if (argc < 2)
    {
        cerr << "please input " << endl;
        cerr << "[Format]face_detector_img.exe image.jpg" << endl;
        return -1;
    }
    img = imread(argv[1]);
#else
    // Not use commandline
    img = imread("iron_chic.jpg");
#endif

    // Initialize Caffe network
    float min_confidence = 0.5;
    String modelConfiguration = "face_detector/deploy.prototxt";
    String modelBinary = "face_detector/res10_300x300_ssd_iter_140000.caffemodel";
    dnn::Net net = readNetFromCaffe(modelConfiguration, modelBinary);

    if (net.empty())
    {
        cerr << "Can't load network by using the following files: " << endl;
        cerr << "prototxt: " << modelConfiguration << endl;
        cerr << "caffemodel: " << modelBinary << endl;
        cerr << "Models are available here:" << endl;
        cerr << "<OPENCV_SRC_DIR>/samples/dnn/face_detector" << endl;
        cerr << "or here:" << endl;
        cerr << "https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector" << endl;
        exit(-1);
    }

    // Prepare blob
    Mat inputBlob = blobFromImage(img, inScaleFactor, Size(inWidth, inHeight), meanVal, false, false);
    net.setInput(inputBlob, "data");               // set the network input
    Mat detection = net.forward("detection_out");  // compute output

    // Calculate and display time and frame rate
    vector<double> layersTimings;
    double freq = getTickFrequency() / 1000;
    double time = net.getPerfProfile(layersTimings) / freq;

    Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());
    ostringstream ss;
    ss << "FPS: " << 1000 / time << " ; time: " << time << "ms" << endl;
    putText(img, ss.str(), Point(20, 20), 0, 0.5, Scalar(0, 0, 255));

    float confidenceThreshold = min_confidence;
    for (int i = 0; i < detectionMat.rows; ++i)
    {
        // judge confidence
        float confidence = detectionMat.at<float>(i, 2);

        if (confidence > confidenceThreshold)
        {
            int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * img.cols);
            int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * img.rows);
            int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * img.cols);
            int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * img.rows);
            Rect object(xLeftBottom, yLeftBottom,
                        xRightTop - xLeftBottom,
                        yRightTop - yLeftBottom);
            rectangle(img, object, Scalar(0, 255, 0));

            ss.str("");
            ss << confidence;
            String conf(ss.str());
            String label = "Face: " + conf;
            int baseLine = 0;
            Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
            rectangle(img, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),
                                Size(labelSize.width, labelSize.height + baseLine)),
                      Scalar(255, 255, 255), CV_FILLED);
            putText(img, label, Point(xLeftBottom, yLeftBottom),
                    FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
        }
    }

    namedWindow("Face Detection", WINDOW_NORMAL);
    imshow("Face Detection", img);
    waitKey(0);

    return 0;
}
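A note on how the loop above interprets the network output: the detection_out blob has shape 1 x 1 x N x 7, and each of the N rows is [image_id, class_id, confidence, x_min, y_min, x_max, y_max], with the four coordinates normalized to [0, 1]. That is why the code multiplies them by img.cols and img.rows. A minimal Python sketch of the same decoding (the image path "iron_chic.jpg" and the model paths are placeholders for your own files):

import cv2

net = cv2.dnn.readNetFromCaffe("face_detector/deploy.prototxt",
                               "face_detector/res10_300x300_ssd_iter_140000.caffemodel")
img = cv2.imread("iron_chic.jpg")
h, w = img.shape[:2]

blob = cv2.dnn.blobFromImage(img, 1.0, (300, 300), (104.0, 177.0, 123.0), False, False)
net.setInput(blob)
detections = net.forward()            # shape: (1, 1, N, 7)

for i in range(detections.shape[2]):
    _, _, confidence, x1, y1, x2, y2 = detections[0, 0, i]
    if confidence > 0.5:
        # Scale the normalized coordinates back to pixel units.
        box = (int(x1 * w), int(y1 * h), int(x2 * w), int(y2 * h))
        print("face %.2f at %s" % (confidence, box))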
Detection result:
4.2 Face detection from a camera/video
face_detector_video.cpp
// Summary: Real-time face detection from a camera using the face_detector model with OpenCV 3.3.1
// Author: Amusi
// Date: 2018-02-28
// Reference: http://blog.csdn.net/minstyrain/article/details/78907425

#include <iostream>
#include <cstdlib>
#include <stdio.h>
#include <opencv2/opencv.hpp>
#include <opencv2/dnn.hpp>
#include <opencv2/dnn/shape_utils.hpp>

using namespace cv;
using namespace cv::dnn;
using namespace std;

const size_t inWidth = 300;
const size_t inHeight = 300;
const double inScaleFactor = 1.0;
const Scalar meanVal(104.0, 177.0, 123.0);

int main(int argc, char** argv)
{
    float min_confidence = 0.5;
    String modelConfiguration = "face_detector/deploy.prototxt";
    String modelBinary = "face_detector/res10_300x300_ssd_iter_140000.caffemodel";

    //! [Initialize network]
    dnn::Net net = readNetFromCaffe(modelConfiguration, modelBinary);
    //! [Initialize network]
    if (net.empty())
    {
        cerr << "Can't load network by using the following files: " << endl;
        cerr << "prototxt: " << modelConfiguration << endl;
        cerr << "caffemodel: " << modelBinary << endl;
        cerr << "Models are available here:" << endl;
        cerr << "<OPENCV_SRC_DIR>/samples/dnn/face_detector" << endl;
        cerr << "or here:" << endl;
        cerr << "https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector" << endl;
        exit(-1);
    }

    VideoCapture cap(0);
    if (!cap.isOpened())
    {
        cout << "Couldn't open camera : " << endl;
        return -1;
    }

    for (;;)
    {
        Mat frame;
        cap >> frame; // get a new frame from camera/video or read image

        if (frame.empty())
        {
            waitKey();
            break;
        }

        if (frame.channels() == 4)
            cvtColor(frame, frame, COLOR_BGRA2BGR);

        //! [Prepare blob]
        Mat inputBlob = blobFromImage(frame, inScaleFactor,
                                      Size(inWidth, inHeight), meanVal, false, false); // Convert Mat to batch of images
        //! [Prepare blob]

        //! [Set input blob]
        net.setInput(inputBlob, "data"); // set the network input
        //! [Set input blob]

        //! [Make forward pass]
        Mat detection = net.forward("detection_out"); // compute output
        //! [Make forward pass]

        vector<double> layersTimings;
        double freq = getTickFrequency() / 1000;
        double time = net.getPerfProfile(layersTimings) / freq;

        Mat detectionMat(detection.size[2], detection.size[3], CV_32F, detection.ptr<float>());

        ostringstream ss;
        ss << "FPS: " << 1000 / time << " ; time: " << time << " ms";
        putText(frame, ss.str(), Point(20, 20), 0, 0.5, Scalar(0, 0, 255));

        float confidenceThreshold = min_confidence;
        for (int i = 0; i < detectionMat.rows; i++)
        {
            float confidence = detectionMat.at<float>(i, 2);

            if (confidence > confidenceThreshold)
            {
                int xLeftBottom = static_cast<int>(detectionMat.at<float>(i, 3) * frame.cols);
                int yLeftBottom = static_cast<int>(detectionMat.at<float>(i, 4) * frame.rows);
                int xRightTop = static_cast<int>(detectionMat.at<float>(i, 5) * frame.cols);
                int yRightTop = static_cast<int>(detectionMat.at<float>(i, 6) * frame.rows);

                Rect object(xLeftBottom, yLeftBottom,
                            xRightTop - xLeftBottom,
                            yRightTop - yLeftBottom);

                rectangle(frame, object, Scalar(0, 255, 0));

                ss.str("");
                ss << confidence;
                String conf(ss.str());
                String label = "Face: " + conf;
                int baseLine = 0;
                Size labelSize = getTextSize(label, FONT_HERSHEY_SIMPLEX, 0.5, 1, &baseLine);
                rectangle(frame, Rect(Point(xLeftBottom, yLeftBottom - labelSize.height),
                                      Size(labelSize.width, labelSize.height + baseLine)),
                          Scalar(255, 255, 255), CV_FILLED);
                putText(frame, label, Point(xLeftBottom, yLeftBottom),
                        FONT_HERSHEY_SIMPLEX, 0.5, Scalar(0, 0, 0));
            }
        }

        cv::imshow("detections", frame);
        if (waitKey(1) >= 0) break;
    }

    return 0;
}
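The FPS overlay in both C++ programs comes from Net::getPerfProfile(), which returns the time spent in the last forward pass, measured in ticks. The same measurement is available from Python; a small sketch (a dummy all-black 300x300 frame is used here just to trigger one forward pass):

import cv2
import numpy as np

net = cv2.dnn.readNetFromCaffe("face_detector/deploy.prototxt",
                               "face_detector/res10_300x300_ssd_iter_140000.caffemodel")
blob = cv2.dnn.blobFromImage(np.zeros((300, 300, 3), np.uint8), 1.0,
                             (300, 300), (104.0, 177.0, 123.0))
net.setInput(blob)
net.forward()

t, _ = net.getPerfProfile()               # total time of the last forward pass, in ticks
ms = t * 1000.0 / cv2.getTickFrequency()  # convert ticks to milliseconds
print("inference: %.1f ms (%.1f FPS)" % (ms, 1000.0 / ms))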
Detection result:
5 Python code
The easiest way to install the Python version of OpenCV:
pip install opencv-contrib-python
For OpenCV 3.4, you can directly use resnet_ssd_face_python.py from the opencv-3.4.1\samples\dnn folder;
for OpenCV 3.3.1, you can use the following code (which I wrote myself):
5.1 Face detection in an image
detect_faces.py
# USAGE
# python detect_faces.py --image rooster.jpg --prototxt deploy.prototxt.txt --model res10_300x300_ssd_iter_140000.caffemodel

# import the necessary packages
import numpy as np
import argparse
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# load the input image and construct an input blob for the image
# by resizing to a fixed 300x300 pixels and then normalizing it
image = cv2.imread(args["image"])
(h, w) = image.shape[:2]
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)), 1.0,
    (300, 300), (104.0, 177.0, 123.0))

# pass the blob through the network and obtain the detections and
# predictions
print("[INFO] computing object detections...")
net.setInput(blob)
detections = net.forward()

# loop over the detections
for i in range(0, detections.shape[2]):
    # extract the confidence (i.e., probability) associated with the
    # prediction
    confidence = detections[0, 0, i, 2]

    # filter out weak detections by ensuring the `confidence` is
    # greater than the minimum confidence
    if confidence > args["confidence"]:
        # compute the (x, y)-coordinates of the bounding box for the
        # object
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # draw the bounding box of the face along with the associated
        # probability
        text = "{:.2f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(image, (startX, startY), (endX, endY),
            (0, 0, 255), 2)
        cv2.putText(image, text, (startX, y),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)

# show the output image
cv2.imshow("Output", image)
cv2.waitKey(0)
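The blobFromImage call above is worth a closer look: the image is resized to 300x300 (the input size the network expects), the pixel values are scaled by 1.0, and the triple (104.0, 177.0, 123.0), the per-channel BGR mean this Caffe model was trained with, is subtracted from every pixel. A small sketch showing the arguments and the resulting blob shape ("rooster.jpg" is a placeholder image path):

import cv2

image = cv2.imread("rooster.jpg")
blob = cv2.dnn.blobFromImage(cv2.resize(image, (300, 300)),  # input image (BGR)
                             1.0,                            # scale factor applied to pixel values
                             (300, 300),                     # spatial size fed to the network
                             (104.0, 177.0, 123.0))          # per-channel mean to subtract
print(blob.shape)  # (1, 3, 300, 300): a one-image NCHW batch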
Open a cmd prompt, change into the project directory, and run the following command:
python detect_faces.py --image rooster.jpg --prototxt deploy.prototxt.txt --model res10_300x300_ssd_iter_140000.caffemodel
Detection result:
python detect_faces.py --image iron_chic.jpg --prototxt deploy.prototxt.txt --model res10_300x300_ssd_iter_140000.caffemodel
Detection result:
5.2 Face detection from a camera/video
detect_faces_video.py
# USAGE
# python detect_faces_video.py --prototxt deploy.prototxt.txt --model res10_300x300_ssd_iter_140000.caffemodel

# import the necessary packages
from imutils.video import VideoStream
import numpy as np
import argparse
import imutils
import time
import cv2

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())

# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])

# initialize the video stream and allow the camera sensor to warm up
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)

# loop over the frames from the video stream
while True:
    # grab the frame from the threaded video stream and resize it
    # to have a maximum width of 400 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=400)

    # grab the frame dimensions and convert it to a blob
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
        (300, 300), (104.0, 177.0, 123.0))

    # pass the blob through the network and obtain the detections and
    # predictions
    net.setInput(blob)
    detections = net.forward()

    # loop over the detections
    for i in range(0, detections.shape[2]):
        # extract the confidence (i.e., probability) associated with the
        # prediction
        confidence = detections[0, 0, i, 2]

        # filter out weak detections by ensuring the `confidence` is
        # greater than the minimum confidence
        if confidence < args["confidence"]:
            continue

        # compute the (x, y)-coordinates of the bounding box for the
        # object
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")

        # draw the bounding box of the face along with the associated
        # probability
        text = "{:.2f}%".format(confidence * 100)
        y = startY - 10 if startY - 10 > 10 else startY + 10
        cv2.rectangle(frame, (startX, startY), (endX, endY),
            (0, 0, 255), 2)
        cv2.putText(frame, text, (startX, y),
            cv2.FONT_HERSHEY_SIMPLEX, 0.45, (0, 0, 255), 2)

    # show the output frame
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
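The script above reads from the default webcam through imutils' VideoStream. To run the same detector on a video file, plain cv2.VideoCapture works just as well; a minimal sketch ("input.mp4" and the model paths are placeholders, adjust them to your own files):

import numpy as np
import cv2

net = cv2.dnn.readNetFromCaffe("deploy.prototxt.txt",
                               "res10_300x300_ssd_iter_140000.caffemodel")
cap = cv2.VideoCapture("input.mp4")       # or cv2.VideoCapture(0) for a webcam

while True:
    ok, frame = cap.read()
    if not ok:                            # end of file or read error
        break
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
                                 (300, 300), (104.0, 177.0, 123.0))
    net.setInput(blob)
    detections = net.forward()

    # draw every detection above the confidence threshold
    for i in range(detections.shape[2]):
        confidence = detections[0, 0, i, 2]
        if confidence < 0.5:
            continue
        box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
        (startX, startY, endX, endY) = box.astype("int")
        cv2.rectangle(frame, (startX, startY), (endX, endY), (0, 0, 255), 2)

    cv2.imshow("Frame", frame)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()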
Open a cmd prompt, change into the project directory, and run the following command:
python detect_faces_video.py --prototxt deploy.prototxt.txt --model res10_300x300_ssd_iter_140000.caffemodel
If the program fails with an error such as ImportError: No module named imutils.video, the imutils package is missing from your current Python environment; install it with pip:
pip install imutils
Detection result:
Summary
This tutorial introduced and used the more accurate face detector that OpenCV now provides (compared with OpenCV's Haar cascades).
This OpenCV face detector is deep-learning based; specifically, it uses a ResNet base network within the SSD detection framework.
Thanks to the efforts of Aleksandr Rybnikov, the OpenCV dnn module, Adrian Rosebrock, and the other contributors, we can enjoy this more accurate OpenCV face detector in our own applications.
For your convenience, I have packaged the files needed for this tutorial; see the download below.
Code download
deep-learning-face-detection.rar:https://anonfile.com/nft4G4d5b1/deep-learning-face-detection.rar
Reference
[1]Face detection with OpenCV and deep learning:https://www.pyimagesearch.com/2018/02/26/face-detection-with-opencv-and-deep-learning/
[2]face_detector:https://github.com/opencv/opencv/tree/master/samples/dnn/face_detector
[3]opencv3.4 發佈 dnnFace震撼來襲:http://blog.csdn.net/minstyrain/article/details/78907425