OpenCV Face Recognition: Analysis of the LBPH Algorithm Source Code

Date: 2019-11-13

Tags: opencv, recognition, lbph, algorithm source code analysis


1 Background and Theoretical Basis

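Face recognition means matching a face to be identified against a particular face in a face database (much like fingerprint recognition); its goal is identification. The term must be distinguished from face detection, which locates faces in an image and is therefore a search task. Starting with OpenCV 2.4, a new class, FaceRecognizer, was added for face recognition, and it makes recognition experiments convenient to run.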

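The original LBP operator is defined on a 3×3 window. Taking the window's center pixel as the threshold, the gray values of the 8 neighboring pixels are compared with it: if a neighbor's value is greater than or equal to the center value, that position is marked 1, otherwise 0. The 8 points of the 3×3 neighborhood thus yield an 8-bit binary number (usually converted to decimal, the LBP code, with 256 possible values). This number is the LBP value of the window's center pixel and reflects the texture of that region. (The original post illustrates this with a figure.)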
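The thresholding step can be sketched in a few lines of Python. This is only an illustration: the function name `lbp_code`, the sample window, and the clockwise bit order starting at the top-left neighbor are assumptions made here, since the text does not fix an ordering.

```python
def lbp_code(window):
    """Return the LBP code of the center pixel of a 3x3 window (3 rows of 3 values)."""
    center = window[1][1]
    # Neighbors visited clockwise from the top-left corner (assumed ordering).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for row, col in offsets:
        # A neighbor >= center contributes a 1 bit, otherwise a 0 bit.
        bit = 1 if window[row][col] >= center else 0
        code = (code << 1) | bit
    return code


# Hypothetical gray values, chosen only for illustration.
window = [[6, 5, 2],
          [7, 6, 1],
          [9, 8, 7]]
print(format(lbp_code(window), "08b"))  # prints 10001111
```

A different starting neighbor or direction only permutes the bits; what matters is that the same ordering is used for every pixel.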

After the original LBP was proposed, researchers introduced a series of improvements and optimizations.

1.1 Circular LBP Operator

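The biggest shortcoming of the basic LBP operator is that it covers only a small area within a fixed radius, which clearly cannot meet the needs of textures of different sizes and frequencies. To handle texture features at different scales, Ojala et al. improved the LBP operator by extending the 3×3 neighborhood to an arbitrary neighborhood and replacing the square neighborhood with a circular one. The improved operator allows any number of sampling points within a circular neighborhood of radius R, giving an LBP operator with P sampling points on a circle of radius R. OpenCV uses exactly this circular LBP operator. (The original post illustrates the circular operator with a figure.)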
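As a small sketch of where the P sampling points fall, the offsets can be computed from R and P. The formula mirrors the one in the elbp_ source quoted in section 3 (x = R·cos, y = −R·sin); the function name `circular_offsets` is hypothetical.

```python
import math


def circular_offsets(radius, neighbors):
    """Return the (dx, dy) offsets of the sampling points around the center pixel."""
    offsets = []
    for n in range(neighbors):
        angle = 2.0 * math.pi * n / neighbors
        dx = radius * math.cos(angle)
        dy = -radius * math.sin(angle)  # negative because image rows grow downwards
        offsets.append((dx, dy))
    return offsets


for dx, dy in circular_offsets(1, 8):
    print("%+.3f %+.3f" % (dx, dy))
```

Most of these offsets do not land on pixel centers, which is why the implementation interpolates between neighboring pixels.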

1.2 Rotation-Invariant Patterns

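From the definition it follows that the LBP operator is invariant to monotonic gray-scale changes but not to rotation: rotating the image produces different LBP values. Maenpaa et al. extended the operator to make it rotation invariant. The circular neighborhood is rotated step by step to obtain a series of LBP values under the original definition, and the minimum is taken as the LBP value of the neighborhood. The figure in the original post shows this process, with the number under each operator giving its LBP value: the 8 LBP patterns shown all reduce to the rotation-invariant LBP value 15. In other words, the rotation-invariant LBP code for all 8 patterns is 00001111.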
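The "rotate and take the minimum" rule is easy to express directly. The following is a minimal sketch, assuming the code is stored as an integer with `bits` bits; the name `rotation_invariant` is hypothetical.

```python
def rotation_invariant(code, bits=8):
    """Return the smallest value among all circular rotations of `code`."""
    mask = (1 << bits) - 1
    best = code
    for _ in range(bits - 1):
        # Rotate right by one position within `bits` bits.
        code = ((code >> 1) | (code << (bits - 1))) & mask
        best = min(best, code)
    return best


# All rotations of 00001111 collapse to the same value.
print(rotation_invariant(0b11110000))  # prints 15
print(rotation_invariant(0b10000111))  # prints 15
```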

1.3 Uniform Patterns

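An LBP operator can produce many different binary patterns: an operator with P sampling points on a circle of radius R produces 2^P patterns. The number of patterns therefore grows very quickly with the number of sampling points. For example, 20 sampling points in a 5×5 neighborhood give 2^20 = 1,048,576 binary patterns. So many patterns are a burden for texture extraction, recognition, classification and storage alike. To reduce the number of patterns and improve the statistics, Ojala proposed "uniform patterns" as a way of reducing the dimensionality of the LBP pattern set.

Ojala et al. observed that in real images the vast majority of LBP patterns contain at most two transitions from 1 to 0 or from 0 to 1. A uniform pattern is therefore defined as follows: if the circular binary number corresponding to a local binary pattern has at most two transitions from 0 to 1 or from 1 to 0, that binary number forms a uniform pattern class. For example, 00000000 (0 transitions), 00000111 (one transition from 0 to 1 and one from 1 to 0) and 10001111 (1 to 0, then 0 to 1, two transitions in total) are all uniform pattern classes. All other patterns are grouped into one additional class, the mixed pattern class, for example 10010111 (four transitions).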
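Counting transitions on the circular bit string can be sketched as follows. The helper names `transitions` and `is_uniform` are hypothetical, and the code assumes the pattern fits in `bits` bits.

```python
def transitions(code, bits=8):
    """Count 0->1 and 1->0 transitions in the circular bit string of `code`."""
    # Rotating by one position and XOR-ing marks every place where adjacent bits differ.
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")


def is_uniform(code, bits=8):
    """A pattern is uniform when it has at most two transitions."""
    return transitions(code, bits) <= 2


# The four patterns used as examples in the text.
for pattern in (0b00000000, 0b00000111, 0b10001111, 0b10010111):
    print(format(pattern, "08b"), transitions(pattern), is_uniform(pattern))
```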

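With this improvement the number of pattern types drops sharply, from 2^P to P(P-1)+2+1, where P is the number of sampling points in the neighborhood: the uniform pattern classes account for P(P-1)+2 patterns and the mixed pattern class for 1. For 8 sampling points in a 3×3 neighborhood, the number of patterns falls from 256 to 59, which lowers the dimensionality of the feature vector and reduces the influence of high-frequency noise.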
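The count P(P-1)+2+1 can be checked by brute force. The sketch below builds a lookup table from every LBP code to a histogram bin; the function names and the choice to put the mixed class in the last bin are assumptions made for illustration.

```python
def transitions(code, bits):
    """Count 0/1 transitions in the circular bit string of `code`."""
    rotated = (code >> 1) | ((code & 1) << (bits - 1))
    return bin(code ^ rotated).count("1")


def uniform_lookup(bits=8):
    """Map each LBP code to a bin: one bin per uniform pattern,
    plus a single shared bin for all mixed patterns."""
    mixed_bin = bits * (bits - 1) + 2  # index of the shared last bin
    table = []
    next_bin = 0
    for code in range(1 << bits):
        if transitions(code, bits) <= 2:
            table.append(next_bin)
            next_bin += 1
        else:
            table.append(mixed_bin)
    return table


table = uniform_lookup(8)
print(len(set(table)))  # prints 59
```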

2 How LBP Features Are Used for Detection

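Clearly, the LBP operator described above yields an LBP "code" at every pixel. So after the original LBP operator is applied to an image (which records the gray value of each pixel), the resulting raw LBP feature is still "an image" (which records the LBP value of each pixel), as the figure in the original post shows.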

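Using this LBP image directly for face recognition is really no different from not extracting LBP features at all. In practice, LBP applications generally use the statistical histogram of the LBP codes as the feature vector for classification. The image can be divided into a number of sub-regions; LBP features are extracted at every pixel of each sub-region, and a histogram of LBP features is then built within each sub-region. Each sub-region is thus described by one histogram, and the whole image by a set of histograms. The benefit is that errors caused by imperfect image alignment are reduced to some extent. Partitioning also makes it possible to weight sub-regions differently: for instance, central regions can be weighted more heavily than edge regions, meaning the center matters more when matching images.

For example, a 100×100-pixel image can be divided into 10×10 = 100 sub-regions (there are many ways to partition), each 10×10 pixels. LBP features are extracted at every pixel of each sub-region and a histogram is built, so the image has 10×10 sub-regions and therefore 10×10 histograms, which together describe the image. A similarity measure can then be used to judge how similar two images are. For LBP face recognition OpenCV uses the chi-square distance (CV_COMP_CHISQR in the predict source in section 3), which OpenCV 2.4 defines for two histograms H_1 and H_2 as:

$$d(H_1, H_2) = \sum_I \frac{(H_1(I) - H_2(I))^2}{H_1(I)}$$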
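To make the partition-and-compare idea concrete, here is a small pure-Python sketch. It is not OpenCV's code: `spatial_histogram` and `chi_square` are simplified stand-ins written for this illustration, the tiny 4×4 "LBP image" is made up, and skipping bins where the first histogram is zero is an assumption about how the division is guarded.

```python
def spatial_histogram(lbp, num_patterns, grid_x, grid_y):
    """Concatenate the normalized histograms of all grid cells into one vector."""
    rows, cols = len(lbp), len(lbp[0])
    cell_h, cell_w = rows // grid_y, cols // grid_x
    feature = []
    for gy in range(grid_y):
        for gx in range(grid_x):
            hist = [0.0] * num_patterns
            for r in range(gy * cell_h, (gy + 1) * cell_h):
                for c in range(gx * cell_w, (gx + 1) * cell_w):
                    hist[lbp[r][c]] += 1.0
            total = cell_h * cell_w
            feature.extend(h / total for h in hist)  # normalize each cell
    return feature


def chi_square(h1, h2):
    """Chi-square distance: sum of (h1 - h2)^2 / h1 over bins where h1 > 0."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a > 0)


# Two made-up 4x4 LBP images with codes in 0..3, split into a 2x2 grid.
image_a = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
image_b = [[0, 1, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
feat_a = spatial_histogram(image_a, 4, 2, 2)
feat_b = spatial_histogram(image_b, 4, 2, 2)
print(len(feat_a), chi_square(feat_a, feat_a), chi_square(feat_a, feat_b))
```

A distance of 0 means identical histograms; larger values mean less similar images.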

3 Key Parts of the LBPH Face Recognition Source Code

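Taking OpenCV 2.4.9 as an example, the LBPH class source is in the file opencv2.4.9\sources\modules\contrib\src\facerec.cpp. The declaration and implementation of the function that creates an LBPH object are as follows: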

CV_EXPORTS_W Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius=1, int neighbors=8,int grid_x=8, int grid_y=8, double threshold = DBL_MAX);

Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius, int neighbors,int grid_x, int grid_y, double threshold)
{
    return new LBPH(radius, neighbors, grid_x, grid_y, threshold);
}


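The code shows that LBPH uses the circular LBP operator. By default the radius is 1, the number of sampling points P is 8, and there are 8 partitions in each of the x and y directions, i.e. 8*8=64 partitions. The last parameter is the similarity threshold: a match is reported only when the distance between the image to be recognized and an image in the database is below this value. For the LBPH class, we first look at its training function, train: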

void LBPH::train(InputArrayOfArrays _in_src, InputArray _in_labels, bool preserveData) {
    if(_in_src.kind() != _InputArray::STD_VECTOR_MAT && _in_src.kind() != _InputArray::STD_VECTOR_VECTOR) {
        string error_message = "The images are expected as InputArray::STD_VECTOR_MAT (a std::vector<Mat>) or _InputArray::STD_VECTOR_VECTOR (a std::vector< vector<...> >).";
        CV_Error(CV_StsBadArg, error_message);
    }
    if(_in_src.total() == 0) {
        string error_message = format("Empty training data was given. You'll need more than one sample to learn a model.");
        CV_Error(CV_StsUnsupportedFormat, error_message);
    } else if(_in_labels.getMat().type() != CV_32SC1) {
        string error_message = format("Labels must be given as integer (CV_32SC1). Expected %d, but was %d.", CV_32SC1, _in_labels.type());
        CV_Error(CV_StsUnsupportedFormat, error_message);
    }
    // get the vector of matrices
    vector<Mat> src;
    _in_src.getMatVector(src);
    // get the label matrix
    Mat labels = _in_labels.getMat();
    // check if data is well- aligned
    if(labels.total() != src.size()) {
        string error_message = format("The number of samples (src) must equal the number of labels (labels). Was len(samples)=%d, len(labels)=%d.", src.size(), _labels.total());
        CV_Error(CV_StsBadArg, error_message);
    }
    // if this model should be trained without preserving old data, delete old model data
    if(!preserveData) {
        _labels.release();
        _histograms.clear();
    }
    // append labels to _labels matrix
    for(size_t labelIdx = 0; labelIdx < labels.total(); labelIdx++) {
        _labels.push_back(labels.at<int>((int)labelIdx));
    }
    // store the spatial histograms of the original data
    for(size_t sampleIdx = 0; sampleIdx < src.size(); sampleIdx++) {
        // calculate lbp image
        Mat lbp_image = elbp(src[sampleIdx], _radius, _neighbors);
        // get spatial histogram from this lbp image
        Mat p = spatial_histogram(
                lbp_image, /* lbp_image */
                static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
                _grid_x, /* grid size x */
                _grid_y, /* grid size y */
                true);
        // add to templates
        _histograms.push_back(p);
    }
}
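As a rough check on what train stores per image, the sizes involved can be worked out from the parameters. The sketch below is an illustration only: the helper names `lbp_image_size` and `feature_length` are hypothetical, and the arithmetic is read off the quoted source (elbp_ allocates an output of rows-2*radius by cols-2*radius, and spatial_histogram produces grid_x*grid_y rows of numPatterns bins).

```python
def lbp_image_size(rows, cols, radius):
    """Size of the LBP image: a border of `radius` pixels is lost on every side."""
    return rows - 2 * radius, cols - 2 * radius


def feature_length(neighbors, grid_x, grid_y):
    """Length of the concatenated histogram: one 2^neighbors-bin histogram per cell."""
    return grid_x * grid_y * 2 ** neighbors


# Default parameters (radius=1, neighbors=8, grid 8x8) on a 100x100 image.
print(lbp_image_size(100, 100, 1))  # prints (98, 98)
print(feature_length(8, 8, 8))      # prints 16384
# radius=2, neighbors=16, as in the commented-out variant of the demo.
print(feature_length(16, 8, 8))     # prints 4194304
```

Note that the quoted train code passes 2^neighbors as the number of patterns, so each cell keeps the full histogram rather than the reduced 59-bin uniform histogram of section 1.3. This is why raising neighbors to 16 makes the feature vector much longer.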


template <typename _Tp> static
inline void elbp_(InputArray _src, OutputArray _dst, int radius, int neighbors) {
    //get matrices
    Mat src = _src.getMat();
    // allocate memory for result
    _dst.create(src.rows-2*radius, src.cols-2*radius, CV_32SC1);
    Mat dst = _dst.getMat();
    // zero
    dst.setTo(0);
    for(int n=0; n<neighbors; n++) {
        // sample points
        float x = static_cast<float>(radius * cos(2.0*CV_PI*n/static_cast<float>(neighbors)));
        float y = static_cast<float>(-radius * sin(2.0*CV_PI*n/static_cast<float>(neighbors)));
        // relative indices
        int fx = static_cast<int>(floor(x));
        int fy = static_cast<int>(floor(y));
        int cx = static_cast<int>(ceil(x));
        int cy = static_cast<int>(ceil(y));
        // fractional part
        float ty = y - fy;
        float tx = x - fx;
        // set interpolation weights
        float w1 = (1 - tx) * (1 - ty);
        float w2 =      tx  * (1 - ty);
        float w3 = (1 - tx) *      ty;
        float w4 =      tx  *      ty;
        // iterate through your data
        for(int i=radius; i < src.rows-radius;i++) {
            for(int j=radius;j < src.cols-radius;j++) {
                // calculate interpolated value
                float t = static_cast<float>(w1*src.at<_Tp>(i+fy,j+fx) + w2*src.at<_Tp>(i+fy,j+cx) + w3*src.at<_Tp>(i+cy,j+fx) + w4*src.at<_Tp>(i+cy,j+cx));
                // floating point precision, so check some machine-dependent epsilon
                dst.at<int>(i-radius,j-radius) += ((t > src.at<_Tp>(i,j)) || (std::abs(t-src.at<_Tp>(i,j)) < std::numeric_limits<float>::epsilon())) << n;
            }
        }
    }
}

static void elbp(InputArray src, OutputArray dst, int radius, int neighbors)
{
    int type = src.type();
    switch (type) {
    case CV_8SC1:   elbp_<char>(src,dst, radius, neighbors); break;
    case CV_8UC1:   elbp_<unsigned char>(src, dst, radius, neighbors); break;
    case CV_16SC1:  elbp_<short>(src,dst, radius, neighbors); break;
    case CV_16UC1:  elbp_<unsigned short>(src,dst, radius, neighbors); break;
    case CV_32SC1:  elbp_<int>(src,dst, radius, neighbors); break;
    case CV_32FC1:  elbp_<float>(src,dst, radius, neighbors); break;
    case CV_64FC1:  elbp_<double>(src,dst, radius, neighbors); break;
    default:
        string error_msg = format("Using Original Local Binary Patterns for feature extraction only works on single-channel images (given %d). Please pass the image data as a grayscale image!", type);
        CV_Error(CV_StsNotImplemented, error_msg);
        break;
    }
}

static Mat
histc_(const Mat& src, int minVal=0, int maxVal=255, bool normed=false)
{
    Mat result;
    // Establish the number of bins.
    int histSize = maxVal-minVal+1;
    // Set the ranges.
    float range[] = { static_cast<float>(minVal), static_cast<float>(maxVal+1) };
    const float* histRange = { range };
    // calc histogram
    calcHist(&src, 1, 0, Mat(), result, 1, &histSize, &histRange, true, false);
    // normalize
    if(normed) {
        result /= (int)src.total();
    }
    return result.reshape(1,1);
}

static Mat histc(InputArray _src, int minVal, int maxVal, bool normed)
{
    Mat src = _src.getMat();
    switch (src.type()) {
        case CV_8SC1:
            return histc_(Mat_<float>(src), minVal, maxVal, normed);
            break;
        case CV_8UC1:
            return histc_(src, minVal, maxVal, normed);
            break;
        case CV_16SC1:
            return histc_(Mat_<float>(src), minVal, maxVal, normed);
            break;
        case CV_16UC1:
            return histc_(src, minVal, maxVal, normed);
            break;
        case CV_32SC1:
            return histc_(Mat_<float>(src), minVal, maxVal, normed);
            break;
        case CV_32FC1:
            return histc_(src, minVal, maxVal, normed);
            break;
        default:
            CV_Error(CV_StsUnmatchedFormats, "This type is not implemented yet."); break;
    }
    return Mat();
}


static Mat spatial_histogram(InputArray _src, int numPatterns,
                             int grid_x, int grid_y, bool /*normed*/)
{
    Mat src = _src.getMat();
    // calculate LBP patch size
    int width = src.cols/grid_x;
    int height = src.rows/grid_y;
    // allocate memory for the spatial histogram
    Mat result = Mat::zeros(grid_x * grid_y, numPatterns, CV_32FC1);
    // return matrix with zeros if no data was given
    if(src.empty())
        return result.reshape(1,1);
    // initial result_row
    int resultRowIdx = 0;
    // iterate through grid
    for(int i = 0; i < grid_y; i++) {
        for(int j = 0; j < grid_x; j++) {
            Mat src_cell = Mat(src, Range(i*height,(i+1)*height), Range(j*width,(j+1)*width));
            Mat cell_hist = histc(src_cell, 0, (numPatterns-1), true);
            // copy to the result matrix
            Mat result_row = result.row(resultRowIdx);
            cell_hist.reshape(1,1).convertTo(result_row, CV_32FC1);
            // increase row count in result matrix
            resultRowIdx++;
        }
    }
    // return result as reshaped feature vector
    return result.reshape(1,1);
}

//------------------------------------------------------------------------------
// wrapper to cv::elbp (extended local binary patterns)
//------------------------------------------------------------------------------

static Mat elbp(InputArray src, int radius, int neighbors) {
    Mat dst;
    elbp(src, dst, radius, neighbors);
    return dst;
}
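The weights w1 to w4 in elbp_ amount to bilinear interpolation between the four pixels surrounding each sampling point. The following sketch re-expresses that step in Python for a single pixel; the function names and the 3×3 test image are made up for illustration, and the image is indexed as image[row][col].

```python
import math


def bilinear_weights(x, y):
    """Weights of the four pixels around the fractional offset (x, y)."""
    tx = x - math.floor(x)
    ty = y - math.floor(y)
    return ((1 - tx) * (1 - ty), tx * (1 - ty), (1 - tx) * ty, tx * ty)


def sample(image, row, col, x, y):
    """Interpolated gray value at offset (x, y) from the pixel (row, col)."""
    fx, fy = math.floor(x), math.floor(y)
    cx, cy = math.ceil(x), math.ceil(y)
    w1, w2, w3, w4 = bilinear_weights(x, y)
    return (w1 * image[row + fy][col + fx] + w2 * image[row + fy][col + cx]
            + w3 * image[row + cy][col + fx] + w4 * image[row + cy][col + cx])


image = [[10, 20, 30],
         [40, 50, 60],
         [70, 80, 90]]
print(sample(image, 1, 1, 1.0, 0.0))   # on a pixel center: prints 60.0
print(sample(image, 1, 1, 0.5, -0.5))  # between four pixels: prints 40.0
```

When the offset falls exactly on a pixel, one weight is 1 and the others are 0, so the pixel value is used unchanged.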


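Note that when the values of the 8 sampling points are computed for each image position, each sampling point takes the weighted average of the pixels at its four surrounding corners (bilinear interpolation; see the weights w1 to w4 and the interpolated value t in elbp_ above). This reduces the influence of noisy pixels on the LBP value. The spatial_histogram function reshapes the final per-partition histograms into a single row, which simplifies the similarity computation during recognition. Recognition is implemented by the predict function, whose source is as follows: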

void LBPH::predict(InputArray _src, int &minClass, double &minDist) const {
    if(_histograms.empty()) {
        // throw error if no data (or simply return -1?)
        string error_message = "This LBPH model is not computed yet. Did you call the train method?";
        CV_Error(CV_StsBadArg, error_message);
    }
    Mat src = _src.getMat();
    // get the spatial histogram from input image
    Mat lbp_image = elbp(src, _radius, _neighbors);
    Mat query = spatial_histogram(
            lbp_image, /* lbp_image */
            static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
            _grid_x, /* grid size x */
            _grid_y, /* grid size y */
            true /* normed histograms */);
    // find 1-nearest neighbor
    minDist = DBL_MAX;
    minClass = -1;
    for(size_t sampleIdx = 0; sampleIdx < _histograms.size(); sampleIdx++) {
        double dist = compareHist(_histograms[sampleIdx], query, CV_COMP_CHISQR);
        if((dist < minDist) && (dist < _threshold)) {
            minDist = dist;
            minClass = _labels.at<int>((int) sampleIdx);
        }
    }
}


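The first part of the function computes query, the partitioned histogram of the image to be predicted, _src. The for loop then compares query with every histogram in the face database array _histograms (the comparison method is CV_COMP_CHISQR) and takes the one with the smallest distance as the final result. This is also where the threshold passed when creating the LBPH object takes effect: if no distance is below threshold, recognition fails and minClass remains -1.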
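The nearest-neighbor search with a threshold can be sketched as follows. This is a simplified Python stand-in rather than OpenCV's implementation: `predict` and `chi_square` are written here for illustration, and the two-bin histograms and labels are made up.

```python
import sys


def chi_square(h1, h2):
    """Chi-square distance: sum of (h1 - h2)^2 / h1 over bins where h1 > 0."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a > 0)


def predict(histograms, labels, query, threshold=sys.float_info.max):
    """Return (label, distance) of the closest stored histogram,
    or (-1, max float) when no distance is below `threshold`."""
    min_dist, min_class = sys.float_info.max, -1
    for hist, label in zip(histograms, labels):
        dist = chi_square(hist, query)
        if dist < min_dist and dist < threshold:
            min_dist, min_class = dist, label
    return min_class, min_dist


# Made-up training histograms with their labels.
histograms = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
labels = [7, 8, 9]
query = [0.4, 0.6]
print(predict(histograms, labels, query))                  # closest is label 9
print(predict(histograms, labels, query, threshold=0.0))   # nothing passes: label -1
```

With a threshold of 0.0 no distance can qualify, which is the situation the demo in section 4 creates with model->set("threshold", 0.0).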

4 LBP Face Recognition Example

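Finally, here is sample code for LBP face recognition. It uses the AT&T face database (also known as the ORL face database), which contains 40 people with 10 photos each, 400 face images in total. The sample code is as follows: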

#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/contrib/contrib.hpp"

#define CV_VERSION_ID       CVAUX_STR(CV_MAJOR_VERSION) CVAUX_STR(CV_MINOR_VERSION) CVAUX_STR(CV_SUBMINOR_VERSION)

#ifdef _DEBUG
#define cvLIB(name) "opencv_" name CV_VERSION_ID "d"
#else
#define cvLIB(name) "opencv_" name CV_VERSION_ID
#endif

#pragma comment( lib, cvLIB("core") )
#pragma comment( lib, cvLIB("imgproc") )
#pragma comment( lib, cvLIB("highgui") )
#pragma comment( lib, cvLIB("flann") )
#pragma comment( lib, cvLIB("features2d") )
#pragma comment( lib, cvLIB("calib3d") )
#pragma comment( lib, cvLIB("gpu") )
#pragma comment( lib, cvLIB("legacy") )
#pragma comment( lib, cvLIB("ml") )
#pragma comment( lib, cvLIB("objdetect") )
#pragma comment( lib, cvLIB("ts") )
#pragma comment( lib, cvLIB("video") )
#pragma comment( lib, cvLIB("contrib") )
#pragma comment( lib, cvLIB("nonfree") )

#include <iostream>
#include <fstream>
#include <sstream>
#include <cstdlib>

using namespace cv;
using namespace std;

static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator =';') {
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) {
        string error_message ="No valid input file was given, please check the given filename.";
        CV_Error(CV_StsBadArg, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty()&&!classlabel.empty()) {
            images.push_back(imread(path, 0));
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}

int main(int argc, const char *argv[]) {
    if (argc !=2) {
        cout <<"usage: "<< argv[0]<<" <csv.ext>"<< endl;
        exit(1);
    }
    string fn_csv = string(argv[1]);
    vector<Mat> images;
    vector<int> labels;
    try {
        read_csv(fn_csv, images, labels);
    } catch (cv::Exception& e) {
        cerr <<"Error opening file "<< fn_csv <<". Reason: "<< e.msg << endl;
        // nothing more we can do
        exit(1);
    }
    if(images.size()<=1) {
        string error_message ="This demo needs at least 2 images to work. Please add more images to your data set!";
        CV_Error(CV_StsError, error_message);
    }
    int height = images[0].rows;
    Mat testSample = images[images.size() -1];
    int testLabel = labels[labels.size() -1];
    images.pop_back();
    labels.pop_back();
    // LBPHFaceRecognizer uses extended LBP
    // (it may well be extended to other operators later).
    // The default parameters are:
    //      radius = 1
    //      neighbors = 8
    //      grid_x = 8
    //      grid_y = 8
    //
    // To create an LBPH FaceRecognizer with radius 2 and 16 neighbors:
    //      cv::createLBPHFaceRecognizer(2, 16);
    //
    // To use a threshold while keeping the default parameters:
    //      cv::createLBPHFaceRecognizer(1,8,8,8,123.0)
    //
    Ptr<FaceRecognizer> model = createLBPHFaceRecognizer();
    model->train(images, labels);
    int predictedLabel = model->predict(testSample);
    //      int predictedLabel = -1;
    //      double confidence = 0.0;
    //      model->predict(testSample, predictedLabel, confidence);
    //
    string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
    cout << result_message << endl;
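    // Sometimes you need to set or get internal model data
    // that is not exposed by the cv::FaceRecognizer class.
    //
    // First we set the FaceRecognizer threshold to 0.0 without retraining the model.
    // This is useful when re-evaluating the model.
    //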
    model->set("threshold",0.0);
    predictedLabel = model->predict(testSample);
    cout <<"Predicted class = "<< predictedLabel << endl;
    // For efficiency, the LBP images are not stored in the model.
    cout <<"Model Information:"<< endl;
    string model_info = format("\tLBPH(radius=%i, neighbors=%i, grid_x=%i, grid_y=%i, threshold=%.2f)",
        model->getInt("radius"),
        model->getInt("neighbors"),
        model->getInt("grid_x"),
        model->getInt("grid_y"),
        model->getDouble("threshold"));
    cout << model_info << endl;
    // We can get the histograms of the samples:
    vector<Mat> histograms = model->getMatVector("histograms");
    // Do we need to display them? Probably only their length is of interest:
    cout <<"Size of the histograms: "<< histograms[0].total()<< endl;
    return 0;
}


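The program uses a CSV file to specify the face database files and their labels: each line of the CSV file contains a file path followed by its label, separated by a semicolon. You can create the CSV file by hand, or of course use a simple Python program to generate it. My Python script is as follows: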

import os

def read_images(path):
    """Write one "<image path>;<label>" line per image found under path."""
    label = 0
    # The CSV is written to test.txt inside the database root folder.
    with open(os.path.join(path, "test.txt"), "w") as fp:
        for dirname, dirnames, filenames in os.walk(path):
            for subdirname in dirnames:
                subject_path = os.path.join(dirname, subdirname)
                for filename in os.listdir(subject_path):
                    line = "%s;%d\n" % (os.path.join(subject_path, filename), label)
                    print(line, end="")
                    fp.write(line)
                # Each subject folder gets the next integer label.
                label += 1

if __name__ == '__main__':
    read_images("F:\\mywork\\facerec_demo\\att_faces")