Abstract: Liveness detection is widely used across many industries. So how do you build a liveness detection system? It used to be difficult, but today it can be done with little more than OpenCV. Give it a try.
As time goes on, face recognition systems are being used more widely than ever before: face unlock on smartphones, face-based clock-in, access-control systems, and more. However, face recognition systems are easily fooled by "non-real" faces. For example, simply holding a person's photo up to the face recognition camera can trick the system into accepting it as a real face.
To make face recognition systems more secure, we not only need to recognize a face, we also need to be able to tell whether that face is real. This is where liveness detection comes in.
There are currently many approaches to liveness detection.
Engineers building face recognition systems can mix and match these approaches to choose a liveness detection model suited to their particular application. This tutorial, however, uses a method common in image processing, a convolutional neural network (CNN), to build a deep network that can distinguish real faces from fake ones (we will call it "LivenessNet"), treating liveness detection as a binary classification problem.
First, let's take a look at the dataset.
To keep the example simple and clear, the liveness detector built in this article focuses on distinguishing real faces from spoofed faces displayed on a screen. The approach can easily be extended to other kinds of spoofed faces, including printouts and high-resolution prints.
The liveness detection dataset and the project are laid out as follows:
$ tree --dirsfirst --filelimit 10
.
├── dataset
│   ├── fake [150 entries]
│   └── real [161 entries]
├── face_detector
│   ├── deploy.prototxt
│   └── res10_300x300_ssd_iter_140000.caffemodel
├── pyimagesearch
│   ├── __init__.py
│   └── livenessnet.py
├── videos
│   ├── fake.mp4
│   └── real.mov
├── gather_examples.py
├── train_liveness.py
├── liveness_demo.py
├── le.pickle
├── liveness.model
└── plot.png

6 directories, 12 files
The project contains four main directories:

dataset/: the dataset directory, containing two classes of images:
- fake faces, captured by screen-recording a phone while a face video was playing;
- real faces, taken from a video of a genuine face recorded directly with the camera.

face_detector/: a pre-trained Caffe face detector used to locate face regions;
pyimagesearch/: a module containing the LivenessNet class;
videos/: two input videos used to train the LivenessNet classifier.

In addition, there are three Python scripts:
gather_examples.py: grabs face regions (ROIs) from the input video files and builds the deep-learning face dataset;
train_liveness.py: trains the LivenessNet classifier. Training produces the following files:
- le.pickle: the class label encoder;
- liveness.model: the trained Keras model;
- plot.png: a training-history plot showing the accuracy and loss curves;
liveness_demo.py: a demo script that opens the webcam and performs real-time face liveness detection.
The data directories:

dataset/fake/: contains face regions extracted from the fake .mp4 file;
dataset/real/: holds face regions extracted from the real .mov file.

Open the gather_examples.py file and insert the following code:
# import the necessary packages
import numpy as np
import argparse
import cv2
import os

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--input", type=str, required=True,
    help="path to input video")
ap.add_argument("-o", "--output", type=str, required=True,
    help="path to output directory of cropped faces")
ap.add_argument("-d", "--detector", type=str, required=True,
    help="path to OpenCV's deep learning face detector")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
ap.add_argument("-s", "--skip", type=int, default=16,
    help="# of frames to skip before applying face detection")
args = vars(ap.parse_args())
This block first imports the required packages. Lines 8-19 then parse the command-line arguments:

--input: path to the input video file;
--output: path to the output directory of cropped faces;
--detector: path to the face detector;
--confidence: minimum probability for a face detection to be kept (default 0.5);
--skip: number of frames to skip between face detections (default 16).

Next, load the face detector and initialize the video stream:
# load our serialized face detector from disk
print("[INFO] loading face detector...")
protoPath = os.path.sep.join([args["detector"], "deploy.prototxt"])
modelPath = os.path.sep.join([args["detector"],
    "res10_300x300_ssd_iter_140000.caffemodel"])
net = cv2.dnn.readNetFromCaffe(protoPath, modelPath)

# open a pointer to the video file stream and initialize the total
# number of frames read and saved thus far
vs = cv2.VideoCapture(args["input"])
read = 0
saved = 0
This block also initializes two counters: the total number of frames read and the number of frames saved as the loop runs.
Now build a loop to process the frames:
# loop over frames from the video file stream
while True:
    # grab the frame from the file
    (grabbed, frame) = vs.read()

    # if the frame was not grabbed, then we have reached the end
    # of the stream
    if not grabbed:
        break

    # increment the total number of frames read thus far
    read += 1

    # check to see if we should process this frame
    if read % args["skip"] != 0:
        continue
Face detection comes next:
    # grab the frame dimensions and construct a blob from the frame
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
        (300, 300), (104.0, 177.0, 123.0))

    # pass the blob through the network and obtain the detections and
    # predictions
    net.setInput(blob)
    detections = net.forward()

    # ensure at least one face was found
    if len(detections) > 0:
        # we're making the assumption that each image has only ONE
        # face, so find the bounding box with the largest probability
        i = np.argmax(detections[0, 0, :, 2])
        confidence = detections[0, 0, i, 2]
To perform face detection, we construct a blob from the frame, resizing the image to a width and height of 300×300 pixels to match what the Caffe face detector expects.
The script also assumes there is only ONE face in each frame of the video, which helps avoid false positives. We take the index of the detection with the highest probability, use that index to look up the detection's confidence, filter out low-probability detections, and write the result to disk:
        # ensure that the detection with the largest probability also
        # passes our minimum probability test (thus helping filter out
        # weak detections)
        if confidence > args["confidence"]:
            # compute the (x, y)-coordinates of the bounding box for
            # the face and extract the face ROI
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")
            face = frame[startY:endY, startX:endX]

            # write the frame to disk
            p = os.path.sep.join([args["output"],
                "{}.png".format(saved)])
            cv2.imwrite(p, face)
            saved += 1
            print("[INFO] saved {} to disk".format(p))

# do a bit of cleanup
vs.release()
cv2.destroyAllWindows()
Here we compute the bounding-box coordinates of the face, extract the face ROI, build a path and file name for it, and write it to disk.
Open a terminal and run the following command to extract face images for the "fake/spoofed" class:
$ python gather_examples.py --input videos/fake.mp4 --output dataset/fake \
    --detector face_detector --skip 1
[INFO] loading face detector...
[INFO] saved dataset/fake/0.png to disk
[INFO] saved dataset/fake/1.png to disk
[INFO] saved dataset/fake/2.png to disk
[INFO] saved dataset/fake/3.png to disk
[INFO] saved dataset/fake/4.png to disk
[INFO] saved dataset/fake/5.png to disk
...
[INFO] saved dataset/fake/145.png to disk
[INFO] saved dataset/fake/146.png to disk
[INFO] saved dataset/fake/147.png to disk
[INFO] saved dataset/fake/148.png to disk
[INFO] saved dataset/fake/149.png to disk
In the same way, run the following command to obtain face images for the "real" class:
$ python gather_examples.py --input videos/real.mov --output dataset/real \
    --detector face_detector --skip 4
[INFO] loading face detector...
[INFO] saved dataset/real/0.png to disk
[INFO] saved dataset/real/1.png to disk
[INFO] saved dataset/real/2.png to disk
[INFO] saved dataset/real/3.png to disk
[INFO] saved dataset/real/4.png to disk
...
[INFO] saved dataset/real/156.png to disk
[INFO] saved dataset/real/157.png to disk
[INFO] saved dataset/real/158.png to disk
[INFO] saved dataset/real/159.png to disk
[INFO] saved dataset/real/160.png to disk
Note that the two classes should end up reasonably balanced. After running both commands, count the number of images in each class and confirm the distribution.
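For example, a few lines of Python can print the per-class counts (a hypothetical helper, not part of the original project; it assumes the dataset/fake and dataset/real layout shown in the tree above):

# count_images.py -- hypothetical helper, not part of the original project
from imutils import paths
import os

# count the extracted face images in each class directory
for split in ("fake", "real"):
    directory = os.path.sep.join(["dataset", split])
    total = len(list(paths.list_images(directory)))
    print("[INFO] {}: {} images".format(split, total))

Given the tree listing above (150 fake and 161 real images), the two classes are close to balanced.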
LivenessNet itself is just a simple convolutional neural network. We deliberately keep the network as shallow as possible, with as few parameters as possible, for two reasons: to reduce the risk of overfitting our small dataset, and to keep the model fast enough to run in real time.
Open livenessnet.py and insert the following code:
# import the necessary packages
from keras.models import Sequential
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Flatten
from keras.layers.core import Dropout
from keras.layers.core import Dense
from keras import backend as K

class LivenessNet:
    @staticmethod
    def build(width, height, depth, classes):
        # initialize the model along with the input shape to be
        # "channels last" and the channels dimension itself
        model = Sequential()
        inputShape = (height, width, depth)
        chanDim = -1

        # if we are using "channels first", update the input shape
        # and channels dimension
        if K.image_data_format() == "channels_first":
            inputShape = (depth, height, width)
            chanDim = 1

        # first CONV => RELU => CONV => RELU => POOL layer set
        model.add(Conv2D(16, (3, 3), padding="same",
            input_shape=inputShape))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(16, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # second CONV => RELU => CONV => RELU => POOL layer set
        model.add(Conv2D(32, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(Conv2D(32, (3, 3), padding="same"))
        model.add(Activation("relu"))
        model.add(BatchNormalization(axis=chanDim))
        model.add(MaxPooling2D(pool_size=(2, 2)))
        model.add(Dropout(0.25))

        # first (and only) set of FC => RELU layers
        model.add(Flatten())
        model.add(Dense(64))
        model.add(Activation("relu"))
        model.add(BatchNormalization())
        model.add(Dropout(0.5))

        # softmax classifier
        model.add(Dense(classes))
        model.add(Activation("softmax"))

        # return the constructed network architecture
        return model
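As a quick sanity check on the "keep it small" goal, you could instantiate the network and print its summary. The snippet below is only an illustration and is not part of the original scripts; it assumes livenessnet.py sits inside the pyimagesearch module as shown in the project tree:

# sanity_check.py -- hypothetical snippet, not part of the original project
from pyimagesearch.livenessnet import LivenessNet

# build the same 32x32x3, two-class model used for training and inspect it;
# the summary should confirm a small network (on the order of 10^5 parameters)
model = LivenessNet.build(width=32, height=32, depth=3, classes=2)
model.summary()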
Open the train_liveness.py file and insert the following code:
# set the matplotlib backend so figures can be saved in the background
import matplotlib
matplotlib.use("Agg")

# import the necessary packages
from pyimagesearch.livenessnet import LivenessNet
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import Adam
from keras.utils import np_utils
from imutils import paths
import matplotlib.pyplot as plt
import numpy as np
import argparse
import pickle
import cv2
import os

# construct the argument parser and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-d", "--dataset", required=True,
    help="path to input dataset")
ap.add_argument("-m", "--model", type=str, required=True,
    help="path to trained model")
ap.add_argument("-l", "--le", type=str, required=True,
    help="path to label encoder")
ap.add_argument("-p", "--plot", type=str, default="plot.png",
    help="path to output loss/accuracy plot")
args = vars(ap.parse_args())
This script accepts four command-line arguments:

--dataset: path to the input dataset;
--model: path where the trained model will be saved;
--le: path where the serialized label encoder will be saved;
--plot: path of the training plot the script will generate.

The next code block performs initialization and builds the data:
# initialize the initial learning rate, batch size, and number of
# epochs to train for
INIT_LR = 1e-4
BS = 8
EPOCHS = 50

# grab the list of images in our dataset directory, then initialize
# the list of data (i.e., images) and class labels
print("[INFO] loading images...")
imagePaths = list(paths.list_images(args["dataset"]))
data = []
labels = []

for imagePath in imagePaths:
    # extract the class label from the filename, load the image and
    # resize it to be a fixed 32x32 pixels, ignoring aspect ratio
    label = imagePath.split(os.path.sep)[-2]
    image = cv2.imread(imagePath)
    image = cv2.resize(image, (32, 32))

    # update the data and labels lists, respectively
    data.append(image)
    labels.append(label)

# convert the data into a NumPy array, then preprocess it by scaling
# all pixel intensities to the range [0, 1]
data = np.array(data, dtype="float") / 255.0
Next, one-hot encode the labels and split the data into training (75%) and testing (25%) sets:
# encode the labels (which are currently strings) as integers and then
# one-hot encode them
le = LabelEncoder()
labels = le.fit_transform(labels)
labels = np_utils.to_categorical(labels, 2)

# partition the data into training and testing splits using 75% of
# the data for training and the remaining 25% for testing
(trainX, testX, trainY, testY) = train_test_split(data, labels,
    test_size=0.25, random_state=42)
Then augment the data, and compile and train the model:
# construct the training image generator for data augmentation
aug = ImageDataGenerator(rotation_range=20, zoom_range=0.15,
    width_shift_range=0.2, height_shift_range=0.2, shear_range=0.15,
    horizontal_flip=True, fill_mode="nearest")

# initialize the optimizer and model
print("[INFO] compiling model...")
opt = Adam(lr=INIT_LR, decay=INIT_LR / EPOCHS)
model = LivenessNet.build(width=32, height=32, depth=3,
    classes=len(le.classes_))
model.compile(loss="binary_crossentropy", optimizer=opt,
    metrics=["accuracy"])

# train the network
print("[INFO] training network for {} epochs...".format(EPOCHS))
H = model.fit_generator(aug.flow(trainX, trainY, batch_size=BS),
    validation_data=(testX, testY), steps_per_epoch=len(trainX) // BS,
    epochs=EPOCHS)
After training, we can evaluate the model and generate the training curves:
# evaluate the network
print("[INFO] evaluating network...")
predictions = model.predict(testX, batch_size=BS)
print(classification_report(testY.argmax(axis=1),
    predictions.argmax(axis=1), target_names=le.classes_))

# save the network to disk
print("[INFO] serializing network to '{}'...".format(args["model"]))
model.save(args["model"])

# save the label encoder to disk
f = open(args["le"], "wb")
f.write(pickle.dumps(le))
f.close()

# plot the training loss and accuracy
plt.style.use("ggplot")
plt.figure()
plt.plot(np.arange(0, EPOCHS), H.history["loss"], label="train_loss")
plt.plot(np.arange(0, EPOCHS), H.history["val_loss"], label="val_loss")
plt.plot(np.arange(0, EPOCHS), H.history["acc"], label="train_acc")
plt.plot(np.arange(0, EPOCHS), H.history["val_acc"], label="val_acc")
plt.title("Training Loss and Accuracy on Dataset")
plt.xlabel("Epoch #")
plt.ylabel("Loss/Accuracy")
plt.legend(loc="lower left")
plt.savefig(args["plot"])
Run the following command to start training the model:
$ python train_liveness.py --dataset dataset --model liveness.model --le le.pickle
[INFO] loading images...
[INFO] compiling model...
[INFO] training network for 50 epochs...
Epoch 1/50
29/29 [==============================] - 2s 58ms/step - loss: 1.0113 - acc: 0.5862 - val_loss: 0.4749 - val_acc: 0.7436
Epoch 2/50
29/29 [==============================] - 1s 21ms/step - loss: 0.9418 - acc: 0.6127 - val_loss: 0.4436 - val_acc: 0.7949
Epoch 3/50
29/29 [==============================] - 1s 21ms/step - loss: 0.8926 - acc: 0.6472 - val_loss: 0.3837 - val_acc: 0.8077
...
Epoch 48/50
29/29 [==============================] - 1s 21ms/step - loss: 0.2796 - acc: 0.9094 - val_loss: 0.0299 - val_acc: 1.0000
Epoch 49/50
29/29 [==============================] - 1s 21ms/step - loss: 0.3733 - acc: 0.8792 - val_loss: 0.0346 - val_acc: 0.9872
Epoch 50/50
29/29 [==============================] - 1s 21ms/step - loss: 0.2660 - acc: 0.9008 - val_loss: 0.0322 - val_acc: 0.9872
[INFO] evaluating network...
              precision    recall  f1-score   support

        fake       0.97      1.00      0.99        35
        real       1.00      0.98      0.99        43

   micro avg       0.99      0.99      0.99        78
   macro avg       0.99      0.99      0.99        78
weighted avg       0.99      0.99      0.99        78

[INFO] serializing network to 'liveness.model'...
As the results show, the model achieves 99% detection accuracy on the test set!
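If you want a per-class breakdown beyond the classification report, a confusion matrix is easy to add. The lines below are a hypothetical addition to train_liveness.py, not part of the original script; they reuse the testY and predictions variables defined in the evaluation block above:

# hypothetical addition to train_liveness.py, after model.predict(...)
from sklearn.metrics import confusion_matrix

# rows are true classes, columns are predicted classes,
# in the order given by le.classes_ (e.g. ["fake", "real"])
cm = confusion_matrix(testY.argmax(axis=1), predictions.argmax(axis=1))
print(cm)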
The final step is to put all the pieces together:
Open liveness_demo.py and insert the following code:
# import the necessary packages
from imutils.video import VideoStream
from keras.preprocessing.image import img_to_array
from keras.models import load_model
import numpy as np
import argparse
import imutils
import pickle
import time
import cv2
import os

# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-m", "--model", type=str, required=True,
    help="path to trained model")
ap.add_argument("-l", "--le", type=str, required=True,
    help="path to label encoder")
ap.add_argument("-d", "--detector", type=str, required=True,
    help="path to OpenCV's deep learning face detector")
ap.add_argument("-c", "--confidence", type=float, default=0.5,
    help="minimum probability to filter weak detections")
args = vars(ap.parse_args())
The code above imports the necessary packages and parses the command-line arguments.
Next, initialize the face detector, the LivenessNet model (and its label encoder), and the video stream:
# load our serialized face detector from disk
print("[INFO] loading face detector...")
protoPath = os.path.sep.join([args["detector"], "deploy.prototxt"])
modelPath = os.path.sep.join([args["detector"],
    "res10_300x300_ssd_iter_140000.caffemodel"])
net = cv2.dnn.readNetFromCaffe(protoPath, modelPath)

# load the liveness detector model and label encoder from disk
print("[INFO] loading liveness detector...")
model = load_model(args["model"])
le = pickle.loads(open(args["le"], "rb").read())

# initialize the video stream and allow the camera sensor to warmup
print("[INFO] starting video stream...")
vs = VideoStream(src=0).start()
time.sleep(2.0)
Then loop over each frame of the video stream and check whether the faces in it are real:
# loop over the frames from the video stream
while True:
    # grab the frame from the threaded video stream and resize it
    # to have a maximum width of 600 pixels
    frame = vs.read()
    frame = imutils.resize(frame, width=600)

    # grab the frame dimensions and convert it to a blob
    (h, w) = frame.shape[:2]
    blob = cv2.dnn.blobFromImage(cv2.resize(frame, (300, 300)), 1.0,
        (300, 300), (104.0, 177.0, 123.0))

    # pass the blob through the network and obtain the detections and
    # predictions
    net.setInput(blob)
    detections = net.forward()
OpenCV's blobFromImage function turns the frame into a blob, which is then passed through the face detector network for inference. The core code is as follows:
    # loop over the detections
    for i in range(0, detections.shape[2]):
        # extract the confidence (i.e., probability) associated with the
        # prediction
        confidence = detections[0, 0, i, 2]

        # filter out weak detections
        if confidence > args["confidence"]:
            # compute the (x, y)-coordinates of the bounding box for
            # the face and extract the face ROI
            box = detections[0, 0, i, 3:7] * np.array([w, h, w, h])
            (startX, startY, endX, endY) = box.astype("int")

            # ensure the detected bounding box does not fall outside the
            # dimensions of the frame
            startX = max(0, startX)
            startY = max(0, startY)
            endX = min(w, endX)
            endY = min(h, endY)

            # extract the face ROI and then preprocess it in the exact
            # same manner as our training data
            face = frame[startY:endY, startX:endX]
            face = cv2.resize(face, (32, 32))
            face = face.astype("float") / 255.0
            face = img_to_array(face)
            face = np.expand_dims(face, axis=0)

            # pass the face ROI through the trained liveness detector
            # model to determine if the face is "real" or "fake"
            preds = model.predict(face)[0]
            j = np.argmax(preds)
            label = le.classes_[j]

            # draw the label and bounding box on the frame
            label = "{}: {:.4f}".format(label, preds[j])
            cv2.putText(frame, label, (startX, startY - 10),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 255), 2)
            cv2.rectangle(frame, (startX, startY), (endX, endY),
                (0, 0, 255), 2)
This block first filters out weak detections, then extracts the face ROI and preprocesses it exactly as the training data was preprocessed, and passes it through the liveness detector model to decide whether the face is "real" or "fake/spoofed". The label text and a bounding box are then drawn on the frame, and the remaining code displays the result and cleans up:
    # show the output frame and wait for a key press
    cv2.imshow("Frame", frame)
    key = cv2.waitKey(1) & 0xFF

    # if the `q` key was pressed, break from the loop
    if key == ord("q"):
        break

# do a bit of cleanup
cv2.destroyAllWindows()
vs.stop()
Open a terminal and run the following command:
$ python liveness_demo.py --model liveness.model --le le.pickle \
    --detector face_detector
Using TensorFlow backend.
[INFO] loading face detector...
[INFO] loading liveness detector...
[INFO] starting video stream...
As you can see, the liveness detector successfully distinguishes real faces from fake ones. The video below provides a longer demonstration:
Video link
The system built in this article still has some limitations and shortcomings. The main one is the limited dataset, only 311 images in total. One of the first extensions of this work would simply be to gather additional training data, for example from more people and from people with other skin tones or ethnicities.
In addition, the liveness detector was trained only on on-screen spoof attacks; it was never trained on printed images or photographs. Adding other types of spoofed image sources is therefore recommended.
Finally, keep in mind that there is no single best liveness detection method, only the method most appropriate for your application; good liveness detectors often combine multiple liveness detection techniques. A toy example of that idea follows.
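As a purely illustrative sketch (this helper and its thresholds are hypothetical and not part of this project), one could require the LivenessNet score to agree with a second, simpler cue before accepting a face as real:

# hypothetical example of combining two liveness cues
def is_live(cnn_real_prob, face_width, face_height,
            prob_thresh=0.9, min_size=64):
    # accept only if the CNN is confident the face is real AND the
    # detected face is reasonably large in the frame (very small faces
    # are often replays shown on a distant screen); both thresholds
    # are arbitrary example values
    return (cnn_real_prob >= prob_thresh and
            min(face_width, face_height) >= min_size)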
In this tutorial you learned how to perform liveness detection with OpenCV. With this liveness detector you can spot fake faces in your own face recognition system and defend against face spoofing. Building it drew on OpenCV, deep learning, and Python. The overall process was as follows:
The first step was to collect a dataset of real and fake faces: real faces were captured directly with a camera, while fake faces were captured by screen-recording a phone as a face video played on it.
The second step, once the dataset was ready, was to implement the "LivenessNet" network. The network was intentionally kept shallow, both to limit overfitting on the small dataset and to keep it fast enough for real-time use.
總的來講,本文設計的活體檢測器可以在驗證集上得到99%的準確度。此外,活動檢測器也可以應用於實時視頻流。