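Face recognition means matching a face to be identified against a particular face in a face database (similar to fingerprint recognition); its goal is identification. The term should be distinguished from face detection, which locates the faces in an image, i.e. performs a search. Starting with OpenCV 2.4, a new class, FaceRecognizer, was added for face recognition, and it makes recognition experiments easy to run.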
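The original LBP operator is defined on a 3*3 window. The center pixel of the window is used as a threshold, and the gray values of its 8 neighbors are compared against it: if a neighbor's value is greater than or equal to the center value, that position is marked 1, otherwise 0. The 8 comparisons in the 3*3 neighborhood thus produce an 8-bit binary number (usually converted to decimal, giving the LBP code, with 256 possible values). This is the LBP value of the window's center pixel, and it is used to describe the texture of that region, as illustrated in the accompanying figure.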
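The thresholding rule above can be sketched in a few lines of pure Python. This is a minimal illustration, not OpenCV's code: the function name `lbp_3x3`, the sample window, and the choice to read the neighbors clockwise from the top-left corner are all assumptions made for the example (any fixed ordering works as long as it is used consistently).

```python
def lbp_3x3(img, r, c):
    """Return the original LBP code of pixel (r, c) in a 2-D list of gray values."""
    center = img[r][c]
    # Assumed ordering: clockwise, starting at the top-left neighbor.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dr, dc in offsets:
        # A neighbor that is >= the center contributes a 1 bit, otherwise 0.
        bit = 1 if img[r + dr][c + dc] >= center else 0
        code = (code << 1) | bit
    return code


window = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_3x3(window, 1, 1)
print(format(code, "08b"), code)
```

For this window the bits read 1,0,0,0,1,1,1,1 in the assumed order, so the code is 10001111 in binary.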
After the original LBP was proposed, researchers kept suggesting improvements and optimizations to it.
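The biggest weakness of the basic LBP operator is that it only covers a small area within a fixed radius, which clearly cannot handle textures of different sizes and frequencies. To adapt to texture features at different scales, Ojala et al. improved the LBP operator by extending the 3×3 neighborhood to an arbitrary neighborhood and replacing the square neighborhood with a circular one. The improved operator allows any number of pixels on a circle of radius R, which gives LBP operators with P sampling points on a circle of radius R. OpenCV uses exactly this circular LBP operator, which the accompanying figure illustrates.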
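Sampling points on a circle generally do not fall on pixel centers, so their gray values have to be interpolated. The sketch below computes the value of the n-th sampling point with bilinear interpolation over the four surrounding pixels, following the same arithmetic as the OpenCV `elbp_` source listed later in this post. The function name `circular_sample` and the test image are assumptions made for the example.

```python
import math


def circular_sample(img, r, c, radius, neighbors, n):
    """Interpolated gray value of the n-th of `neighbors` points around (r, c)."""
    angle = 2.0 * math.pi * n / neighbors
    x = radius * math.cos(angle)    # column offset of the sampling point
    y = -radius * math.sin(angle)   # row offset (image rows grow downwards)
    # Integer pixel offsets enclosing the sampling point.
    fx, fy = math.floor(x), math.floor(y)
    cx, cy = math.ceil(x), math.ceil(y)
    # Fractional parts give the bilinear interpolation weights.
    tx, ty = x - fx, y - fy
    w1 = (1 - tx) * (1 - ty)
    w2 = tx * (1 - ty)
    w3 = (1 - tx) * ty
    w4 = tx * ty
    return (w1 * img[r + fy][c + fx] + w2 * img[r + fy][c + cx]
            + w3 * img[r + cy][c + fx] + w4 * img[r + cy][c + cx])


img = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
for n in range(8):
    print(n, round(circular_sample(img, 1, 1, 1, 8, n), 4))
```

With radius 1 and 8 points, the samples at 0°, 90°, 180° and 270° coincide (up to rounding) with the direct neighbors, while the diagonal samples are blends of four pixels.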
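From its definition, the LBP operator is invariant to gray-scale changes but not to rotation: rotating the image produces different LBP values. Maenpaa et al. extended the LBP operator and proposed a rotation-invariant version: the circular neighborhood is rotated step by step to obtain a series of LBP values under the original definition, and the minimum is taken as the LBP value of that neighborhood. The accompanying figure sketches how the rotation-invariant LBP is obtained; the number below each operator is its LBP value. After rotation-invariant processing, the 8 LBP patterns in the figure all yield the rotation-invariant LBP value 15. In other words, the rotation-invariant LBP code of all 8 patterns is 00001111.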
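As a small sketch of the "take the minimum over all rotations" rule, the code below rotates an 8-bit code through every circular shift and keeps the smallest value. The helper names are hypothetical, and the eight patterns are simply the eight rotations of 00001111, chosen to match the example described above.

```python
def rotate_right(code, bits=8):
    """Circularly shift a `bits`-wide code one position to the right."""
    mask = (1 << bits) - 1
    return ((code >> 1) | ((code & 1) << (bits - 1))) & mask


def rotation_invariant(code, bits=8):
    """Smallest value among all circular rotations of `code`."""
    best = code
    for _ in range(bits - 1):
        code = rotate_right(code, bits)
        best = min(best, code)
    return best


patterns = [0b11110000, 0b01111000, 0b00111100, 0b00011110,
            0b00001111, 0b10000111, 0b11000011, 0b11100001]
print([rotation_invariant(p) for p in patterns])
```

Every pattern in the list maps to 15 (00001111), the smallest of its rotations.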
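An LBP operator can produce many different binary patterns: an operator with P sampling points on a circle of radius R produces 2^P patterns. Clearly, the number of binary patterns grows sharply as the number of sampling points increases. For example, 20 sampling points in a 5×5 neighborhood give 2^20 = 1,048,576 binary patterns. That many patterns is a disadvantage for texture extraction, for texture recognition and classification, and for storing the information. To deal with the excessive number of patterns and improve the statistics, Ojala proposed using "uniform patterns" to reduce the number of pattern types of the LBP operator. Ojala et al. observed that in real images the vast majority of LBP patterns contain at most two transitions from 1 to 0 or from 0 to 1. Ojala therefore defined a uniform pattern as follows: when the circular binary number corresponding to a local binary pattern has at most two transitions from 0 to 1 or from 1 to 0, that binary pattern is a uniform pattern class. For example, 00000000 (0 transitions), 00000111 (one transition from 0 to 1 and one from 1 to 0), and 10001111 (first 1 to 0, then 0 to 1, two transitions in total) are all uniform pattern classes. All patterns other than the uniform ones are grouped into a single additional class, called the mixed pattern class, for example 10010111 (four transitions in total).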
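With this improvement the number of binary patterns drops dramatically, from the original 2^P to P(P-1)+2+1, where P is the number of sampling points in the neighborhood: the uniform pattern classes account for P(P-1)+2 patterns, and the mixed pattern class for just 1. For 8 sampling points in a 3×3 neighborhood, the number of patterns falls from 256 to 59, which gives a lower-dimensional feature vector and reduces the influence of high-frequency noise.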
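The counts above can be checked with a short sketch. It counts circular 0/1 transitions by XOR-ing a code with its one-step rotation, then enumerates all 8-bit codes. The function names are assumptions for the example and are not part of OpenCV (whose LBPH implementation, as the source later in this post shows, uses all 2^P patterns rather than uniform ones).

```python
def transitions(code, bits=8):
    """Number of 0/1 changes when the bit string is read as a circle."""
    mask = (1 << bits) - 1
    rotated = ((code >> 1) | ((code & 1) << (bits - 1))) & mask
    # A set bit in the XOR marks a position where adjacent bits differ.
    return bin(code ^ rotated).count("1")


def is_uniform(code, bits=8):
    """A pattern is uniform if it has at most two circular transitions."""
    return transitions(code, bits) <= 2


P = 8
uniform = [c for c in range(2 ** P) if is_uniform(c, P)]
print(len(uniform), P * (P - 1) + 2)   # uniform classes
print(len(uniform) + 1)                # plus one bin for all mixed patterns
print(transitions(0b00000111), transitions(0b10001111), transitions(0b10010111))
```

In a histogram built on uniform patterns, each uniform code would get its own bin and every non-uniform code would share the one remaining bin.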
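Clearly, the LBP operator described above yields one LBP "code" at every pixel. So after extracting the original LBP operator from an image (which records the gray value of each pixel), the resulting raw LBP feature is still "an image" (which records the LBP value of each pixel), as the accompanying figure shows.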
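Using this LBP image directly for face recognition would be little different from not extracting LBP features at all. In practical LBP applications, the statistical histogram of the LBP codes is generally used as the feature vector for classification. The image can also be divided into several sub-regions: the LBP feature is extracted at every pixel of each sub-region, and a histogram of LBP features is then built within each sub-region. Each sub-region is thus described by one histogram, and the whole image by a set of histograms. The benefit is that errors caused by imperfect image alignment are reduced to some extent. Partitioning has another advantage: different sub-regions can be given different weights. For example, we may give the central cells more weight than the cells near the edges, meaning that the central part matters more when matching images.

For example, a 100*100-pixel image can be divided into 10*10=100 sub-regions (the regions can be chosen in various ways), each 10*10 pixels in size. The LBP feature is extracted at every pixel of each sub-region and a histogram is built, so the image has 10*10 sub-regions and therefore 10*10 histograms, which together describe the image. With a suitable similarity measure we can then judge how similar two images are. For LBP face recognition OpenCV uses the chi-square distance between histograms (CV_COMP_CHISQR, as the predict source later in this post shows).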
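The per-cell histogram idea and the chi-square comparison can be sketched as follows. This is an illustrative pure-Python version, not OpenCV's implementation: the function names are hypothetical, and the chi-square form used here, sum((a - b)^2 / a) with empty bins of the first histogram skipped, is an assumption based on how the OpenCV 2.4 documentation describes CV_COMP_CHISQR (other definitions divide by a + b instead).

```python
def spatial_histogram(lbp, grid_x, grid_y, num_patterns=256):
    """Concatenate normalized per-cell histograms of a 2-D list of LBP codes."""
    cell_h = len(lbp) // grid_y
    cell_w = len(lbp[0]) // grid_x
    feature = []
    for gy in range(grid_y):
        for gx in range(grid_x):
            hist = [0.0] * num_patterns
            for r in range(gy * cell_h, (gy + 1) * cell_h):
                for c in range(gx * cell_w, (gx + 1) * cell_w):
                    hist[lbp[r][c]] += 1.0
            total = cell_h * cell_w
            # Normalize so each cell's histogram sums to 1.
            feature.extend(h / total for h in hist)
    return feature


def chi_square(h1, h2):
    """Assumed form: sum((a - b)^2 / a), skipping bins where a is 0."""
    return sum((a - b) ** 2 / a for a, b in zip(h1, h2) if a > 0)


# Toy 4x4 "LBP images" with only 4 possible codes, split into 2x2 cells.
lbp_a = [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
lbp_b = [[1, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
fa = spatial_histogram(lbp_a, 2, 2, num_patterns=4)
fb = spatial_histogram(lbp_b, 2, 2, num_patterns=4)
print(len(fa), chi_square(fa, fa), chi_square(fa, fb))
```

The feature length is the number of cells times the number of patterns (here 2*2*4 = 16). Note that this form of the distance is not symmetric, so the order of the two histograms matters.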
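Taking OpenCV 2.4.9 as an example, the source of the LBPH class is in the file opencv2.4.9\sources\modules\contrib\src\facerec.cpp. The declaration and implementation of the function that creates an LBPH object are as follows: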
CV_EXPORTS_W Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius=1, int neighbors=8,
        int grid_x=8, int grid_y=8, double threshold = DBL_MAX);

Ptr<FaceRecognizer> createLBPHFaceRecognizer(int radius, int neighbors,
        int grid_x, int grid_y, double threshold)
{
    return new LBPH(radius, neighbors, grid_x, grid_y, threshold);
}
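As the code shows, LBPH uses the circular LBP operator. By default the radius of the circle is 1, the number of sampling points P is 8, and the number of cells in both the x and y directions is 8, i.e. 8*8=64 cells. The last parameter is the similarity threshold: a match is produced only when the distance between the image to be recognized and an image in the gallery is below this value. For the LBPH class, let us first look at its training function, train: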
void LBPH::train(InputArrayOfArrays _in_src, InputArray _in_labels, bool preserveData) {
    if(_in_src.kind() != _InputArray::STD_VECTOR_MAT && _in_src.kind() != _InputArray::STD_VECTOR_VECTOR) {
        string error_message = "The images are expected as InputArray::STD_VECTOR_MAT (a std::vector<Mat>) or _InputArray::STD_VECTOR_VECTOR (a std::vector< vector<...> >).";
        CV_Error(CV_StsBadArg, error_message);
    }
    if(_in_src.total() == 0) {
        string error_message = format("Empty training data was given. You'll need more than one sample to learn a model.");
        CV_Error(CV_StsUnsupportedFormat, error_message);
    } else if(_in_labels.getMat().type() != CV_32SC1) {
        string error_message = format("Labels must be given as integer (CV_32SC1). Expected %d, but was %d.", CV_32SC1, _in_labels.type());
        CV_Error(CV_StsUnsupportedFormat, error_message);
    }
    // get the vector of matrices
    vector<Mat> src;
    _in_src.getMatVector(src);
    // get the label matrix
    Mat labels = _in_labels.getMat();
    // check if data is well-aligned
    if(labels.total() != src.size()) {
        string error_message = format("The number of samples (src) must equal the number of labels (labels). Was len(samples)=%d, len(labels)=%d.", src.size(), _labels.total());
        CV_Error(CV_StsBadArg, error_message);
    }
    // if this model should be trained without preserving old data, delete old model data
    if(!preserveData) {
        _labels.release();
        _histograms.clear();
    }
    // append labels to _labels matrix
    for(size_t labelIdx = 0; labelIdx < labels.total(); labelIdx++) {
        _labels.push_back(labels.at<int>((int)labelIdx));
    }
    // store the spatial histograms of the original data
    for(size_t sampleIdx = 0; sampleIdx < src.size(); sampleIdx++) {
        // calculate lbp image
        Mat lbp_image = elbp(src[sampleIdx], _radius, _neighbors);
        // get spatial histogram from this lbp image
        Mat p = spatial_histogram(
                lbp_image, /* lbp_image */
                static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
                _grid_x, /* grid size x */
                _grid_y, /* grid size y */
                true);
        // add to templates
        _histograms.push_back(p);
    }
}
For every training image, train computes the LBP image with elbp and then builds its spatial histogram with spatial_histogram, storing the result in _histograms. The source of these helper functions is as follows:
template <typename _Tp> static
inline void elbp_(InputArray _src, OutputArray _dst, int radius, int neighbors) {
    //get matrices
    Mat src = _src.getMat();
    // allocate memory for result
    _dst.create(src.rows-2*radius, src.cols-2*radius, CV_32SC1);
    Mat dst = _dst.getMat();
    // zero
    dst.setTo(0);
    for(int n=0; n<neighbors; n++) {
        // sample points
        float x = static_cast<float>(radius * cos(2.0*CV_PI*n/static_cast<float>(neighbors)));
        float y = static_cast<float>(-radius * sin(2.0*CV_PI*n/static_cast<float>(neighbors)));
        // relative indices
        int fx = static_cast<int>(floor(x));
        int fy = static_cast<int>(floor(y));
        int cx = static_cast<int>(ceil(x));
        int cy = static_cast<int>(ceil(y));
        // fractional part
        float ty = y - fy;
        float tx = x - fx;
        // set interpolation weights
        float w1 = (1 - tx) * (1 - ty);
        float w2 = tx * (1 - ty);
        float w3 = (1 - tx) * ty;
        float w4 = tx * ty;
        // iterate through your data
        for(int i=radius; i < src.rows-radius;i++) {
            for(int j=radius;j < src.cols-radius;j++) {
                // calculate interpolated value
                float t = static_cast<float>(w1*src.at<_Tp>(i+fy,j+fx) + w2*src.at<_Tp>(i+fy,j+cx) + w3*src.at<_Tp>(i+cy,j+fx) + w4*src.at<_Tp>(i+cy,j+cx));
                // floating point precision, so check some machine-dependent epsilon
                dst.at<int>(i-radius,j-radius) += ((t > src.at<_Tp>(i,j)) || (std::abs(t-src.at<_Tp>(i,j)) < std::numeric_limits<float>::epsilon())) << n;
            }
        }
    }
}

static void elbp(InputArray src, OutputArray dst, int radius, int neighbors)
{
    int type = src.type();
    switch (type) {
    case CV_8SC1:   elbp_<char>(src,dst, radius, neighbors); break;
    case CV_8UC1:   elbp_<unsigned char>(src, dst, radius, neighbors); break;
    case CV_16SC1:  elbp_<short>(src,dst, radius, neighbors); break;
    case CV_16UC1:  elbp_<unsigned short>(src,dst, radius, neighbors); break;
    case CV_32SC1:  elbp_<int>(src,dst, radius, neighbors); break;
    case CV_32FC1:  elbp_<float>(src,dst, radius, neighbors); break;
    case CV_64FC1:  elbp_<double>(src,dst, radius, neighbors); break;
    default:
        string error_msg = format("Using Original Local Binary Patterns for feature extraction only works on single-channel images (given %d). Please pass the image data as a grayscale image!", type);
        CV_Error(CV_StsNotImplemented, error_msg);
        break;
    }
}

static Mat
histc_(const Mat& src, int minVal=0, int maxVal=255, bool normed=false)
{
    Mat result;
    // Establish the number of bins.
    int histSize = maxVal-minVal+1;
    // Set the ranges.
    float range[] = { static_cast<float>(minVal), static_cast<float>(maxVal+1) };
    const float* histRange = { range };
    // calc histogram
    calcHist(&src, 1, 0, Mat(), result, 1, &histSize, &histRange, true, false);
    // normalize
    if(normed) {
        result /= (int)src.total();
    }
    return result.reshape(1,1);
}

static Mat histc(InputArray _src, int minVal, int maxVal, bool normed)
{
    Mat src = _src.getMat();
    switch (src.type()) {
        case CV_8SC1:
            return histc_(Mat_<float>(src), minVal, maxVal, normed);
            break;
        case CV_8UC1:
            return histc_(src, minVal, maxVal, normed);
            break;
        case CV_16SC1:
            return histc_(Mat_<float>(src), minVal, maxVal, normed);
            break;
        case CV_16UC1:
            return histc_(src, minVal, maxVal, normed);
            break;
        case CV_32SC1:
            return histc_(Mat_<float>(src), minVal, maxVal, normed);
            break;
        case CV_32FC1:
            return histc_(src, minVal, maxVal, normed);
            break;
        default:
            CV_Error(CV_StsUnmatchedFormats, "This type is not implemented yet.");
            break;
    }
    return Mat();
}

static Mat spatial_histogram(InputArray _src, int numPatterns,
                             int grid_x, int grid_y, bool /*normed*/)
{
    Mat src = _src.getMat();
    // calculate LBP patch size
    int width = src.cols/grid_x;
    int height = src.rows/grid_y;
    // allocate memory for the spatial histogram
    Mat result = Mat::zeros(grid_x * grid_y, numPatterns, CV_32FC1);
    // return matrix with zeros if no data was given
    if(src.empty())
        return result.reshape(1,1);
    // initial result_row
    int resultRowIdx = 0;
    // iterate through grid
    for(int i = 0; i < grid_y; i++) {
        for(int j = 0; j < grid_x; j++) {
            Mat src_cell = Mat(src, Range(i*height,(i+1)*height), Range(j*width,(j+1)*width));
            Mat cell_hist = histc(src_cell, 0, (numPatterns-1), true);
            // copy to the result matrix
            Mat result_row = result.row(resultRowIdx);
            cell_hist.reshape(1,1).convertTo(result_row, CV_32FC1);
            // increase row count in result matrix
            resultRowIdx++;
        }
    }
    // return result as reshaped feature vector
    return result.reshape(1,1);
}

//------------------------------------------------------------------------------
// wrapper to cv::elbp (extended local binary patterns)
//------------------------------------------------------------------------------
static Mat elbp(InputArray src, int radius, int neighbors) {
    Mat dst;
    elbp(src, dst, radius, neighbors);
    return dst;
}
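Note that when the values of the 8 sampling points are computed at each image position, each sample value is the weighted average of the four pixels at the corners around the sampling point (see the interpolation weights w1 to w4 in the function elbp_ above). This reduces the influence of noisy pixels on the LBP value. The spatial_histogram function reshapes the final per-cell histograms into a single row, which makes the similarity computation during recognition more convenient. Recognition is implemented by the predict function, whose source is as follows: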
void LBPH::predict(InputArray _src, int &minClass, double &minDist) const {
    if(_histograms.empty()) {
        // throw error if no data (or simply return -1?)
        string error_message = "This LBPH model is not computed yet. Did you call the train method?";
        CV_Error(CV_StsBadArg, error_message);
    }
    Mat src = _src.getMat();
    // get the spatial histogram from input image
    Mat lbp_image = elbp(src, _radius, _neighbors);
    Mat query = spatial_histogram(
            lbp_image, /* lbp_image */
            static_cast<int>(std::pow(2.0, static_cast<double>(_neighbors))), /* number of possible patterns */
            _grid_x, /* grid size x */
            _grid_y, /* grid size y */
            true /* normed histograms */);
    // find 1-nearest neighbor
    minDist = DBL_MAX;
    minClass = -1;
    for(size_t sampleIdx = 0; sampleIdx < _histograms.size(); sampleIdx++) {
        double dist = compareHist(_histograms[sampleIdx], query, CV_COMP_CHISQR);
        if((dist < minDist) && (dist < _threshold)) {
            minDist = dist;
            minClass = _labels.at<int>((int) sampleIdx);
        }
    }
}
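The first part of the function computes query, the spatial histogram of the image _src to be predicted. The for loop then compares query with each histogram in the gallery array _histograms (the comparison method is CV_COMP_CHISQR) and takes the entry with the smallest distance as the final result. This is also where the threshold given when the LBPH object was created takes effect: if no distance is smaller than threshold, recognition fails and minClass remains -1.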
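The nearest-neighbor search with a threshold can be mimicked in a few lines of Python. This is only a sketch of the control flow in predict above: the function name, the toy gallery, and the inlined chi-square form (sum((a - b)^2 / a), skipping empty bins of the stored histogram) are assumptions for illustration, not OpenCV's API.

```python
import sys


def predict(query, histograms, labels, threshold=sys.float_info.max):
    """Return (label, distance) of the closest gallery histogram, or -1 if none qualifies."""
    min_dist = sys.float_info.max
    min_class = -1
    for hist, label in zip(histograms, labels):
        # Assumed chi-square form, with the stored histogram as first argument.
        dist = sum((a - b) ** 2 / a for a, b in zip(hist, query) if a > 0)
        # Accept a candidate only if it beats both the best so far and the threshold.
        if dist < min_dist and dist < threshold:
            min_dist, min_class = dist, label
    return min_class, min_dist


gallery = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]
gallery_labels = [7, 9]
query = [0.4, 0.6, 0.0, 0.0]

print(predict(query, gallery, gallery_labels))
print(predict(query, gallery, gallery_labels, threshold=0.0))
```

Because a distance is never negative, a threshold of 0.0 can never be undercut, so the second call returns the label -1. This matches the behavior the sample program demonstrates when it sets the threshold to 0.0.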
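Finally, here is sample code for LBP face recognition. The face database used is the AT&T face database (also known as the ORL face database), which contains 40 people with 10 photos each, 400 face images in total. The sample code is as follows: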
#include "opencv2/core/core.hpp"
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/contrib/contrib.hpp"

#define CV_VERSION_ID CVAUX_STR(CV_MAJOR_VERSION) CVAUX_STR(CV_MINOR_VERSION) CVAUX_STR(CV_SUBMINOR_VERSION)

#ifdef _DEBUG
#define cvLIB(name) "opencv_" name CV_VERSION_ID "d"
#else
#define cvLIB(name) "opencv_" name CV_VERSION_ID
#endif

#pragma comment( lib, cvLIB("core") )
#pragma comment( lib, cvLIB("imgproc") )
#pragma comment( lib, cvLIB("highgui") )
#pragma comment( lib, cvLIB("flann") )
#pragma comment( lib, cvLIB("features2d") )
#pragma comment( lib, cvLIB("calib3d") )
#pragma comment( lib, cvLIB("gpu") )
#pragma comment( lib, cvLIB("legacy") )
#pragma comment( lib, cvLIB("ml") )
#pragma comment( lib, cvLIB("objdetect") )
#pragma comment( lib, cvLIB("ts") )
#pragma comment( lib, cvLIB("video") )
#pragma comment( lib, cvLIB("contrib") )
#pragma comment( lib, cvLIB("nonfree") )

#include <cstdlib>
#include <iostream>
#include <fstream>
#include <sstream>

using namespace cv;
using namespace std;

// Read "<image path>;<label>" lines from a CSV file and load the images as grayscale.
static void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') {
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(CV_StsBadArg, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) {
        stringstream liness(line);
        getline(liness, path, separator);
        getline(liness, classlabel);
        if(!path.empty() && !classlabel.empty()) {
            images.push_back(imread(path, 0));
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}

int main(int argc, const char *argv[]) {
    if (argc != 2) {
        cout << "usage: " << argv[0] << " <csv.ext>" << endl;
        exit(1);
    }
    string fn_csv = string(argv[1]);
    vector<Mat> images;
    vector<int> labels;
    try {
        read_csv(fn_csv, images, labels);
    } catch (cv::Exception& e) {
        cerr << "Error opening file " << fn_csv << ". Reason: " << e.msg << endl;
        // nothing more we can do
        exit(1);
    }
    if(images.size() <= 1) {
        string error_message = "This demo needs at least 2 images to work. Please add more images to your data set!";
        CV_Error(CV_StsError, error_message);
    }
    int height = images[0].rows;
    // Hold out the last image as the test sample and train on the rest.
    Mat testSample = images[images.size() - 1];
    int testLabel = labels[labels.size() - 1];
    images.pop_back();
    labels.pop_back();
    // The LBPHFaceRecognizer uses Extended Local Binary Patterns
    // (it may well be extended with other operators later).
    // These are the default parameters:
    //      radius = 1
    //      neighbors = 8
    //      grid_x = 8
    //      grid_y = 8
    //
    // To create an LBPH FaceRecognizer with radius 2 and 16 neighbors:
    //      cv::createLBPHFaceRecognizer(2, 16);
    //
    // To use a threshold together with the default parameters:
    //      cv::createLBPHFaceRecognizer(1,8,8,8,123.0)
    //
    Ptr<FaceRecognizer> model = createLBPHFaceRecognizer();
    model->train(images, labels);
    int predictedLabel = model->predict(testSample);
    //      int predictedLabel = -1;
    //      double confidence = 0.0;
    //      model->predict(testSample, predictedLabel, confidence);
    //
    string result_message = format("Predicted class = %d / Actual class = %d.", predictedLabel, testLabel);
    cout << result_message << endl;
    // Sometimes you need to set or get internal model data
    // that is not exposed by the cv::FaceRecognizer class.
    //
    // First we set the threshold of the FaceRecognizer to 0.0
    // without retraining the model. This is useful when you are
    // re-evaluating the model.
    //
    model->set("threshold", 0.0);
    predictedLabel = model->predict(testSample);
    cout << "Predicted class = " << predictedLabel << endl;
    // For efficiency, the LBP images are not stored in the model.
    cout << "Model Information:" << endl;
    string model_info = format("tLBPH(radius=%i, neighbors=%i, grid_x=%i, grid_y=%i, threshold=%.2f)",
            model->getInt("radius"),
            model->getInt("neighbors"),
            model->getInt("grid_x"),
            model->getInt("grid_y"),
            model->getDouble("threshold"));
    cout << model_info << endl;
    // We can get the histograms of the training samples:
    vector<Mat> histograms = model->getMatVector("histograms");
    // Do we need to display them? Probably their length is what interests us:
    cout << "Size of the histograms: " << histograms[0].total() << endl;
    return 0;
}
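The program uses a CSV file to specify the face database files and their labels: each line of the CSV file contains a file path followed by its label value, separated by a semicolon. You can create this CSV file by hand, or use a simple Python program to generate it for you. My Python script is as follows: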
import os


def read_images(path):
    """Write one "<image path>;<label>" line per image to <path>/test.txt."""
    c = 0  # label, incremented once per subject directory
    with open(os.path.join(path, "test.txt"), "w") as fp:
        for dirname, dirnames, filenames in os.walk(path):
            for subdirname in dirnames:
                subject_path = os.path.join(dirname, subdirname)
                for filename in os.listdir(subject_path):
                    line = "%s;%d\n" % (os.path.join(subject_path, filename), c)
                    print(line)
                    fp.write(line)
                c += 1


if __name__ == '__main__':
    read_images("F:\\mywork\\facerec_demo\\att_faces")
Change the path passed to read_images on the last line of the script to the location of your own face database.
The result of running the sample program is shown in the accompanying figure.
The second line of the output reflects that when the threshold is set to 0.0 (model->set("threshold",0.0)), no recognition result is produced.
Download link for the sample program (including the face database): http://download.csdn.net/detail/weiwei22844/9557242
This post draws on several other blog posts, with thanks to their authors!