EasyPR Development Guide (8): Text-Based Localization

Date: 2019-11-09


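Today we introduce a new method for license plate localization, the text-based localization method (MSER), covering its main design ideas and its implementation. We then go through several changes introduced in EasyPR v1.5-beta.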

1. Text-Based Localization

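In earlier versions of EasyPR, the most criticized weakness was poor localization, especially on everyday scenes such as photos taken with a phone. EasyPR's earliest data came from traffic checkpoint cameras, so it was tuned for that data and had no good strategy for everyday images. A later version (v1.3) added color-based localization, which helped, but high-resolution images were still handled poorly. In addition, color-based localization degrades sharply on low-light and low-contrast images, and color is itself an unstable feature. As a result, the overall robustness of EasyPR's plate localization was still insufficient.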

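To address this, EasyPR v1.5 adds a new localization method, text-based localization, which greatly improves on these problems. The following figures illustrate its results.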

Figure 1: plate image at night (left). Figure 2: very low-contrast image (right).

Figure 3: close-up image (left). Figure 4: high-resolution image (right).

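Figure 1 is a plate image taken at night, Figure 2 is an image with very low contrast, Figure 3 was shot at very close range, and Figure 4 is a high-resolution image (3200 pixels wide).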

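Text-based localization uses low-level filters to extract characters and then groups them. The approach was originally used to locate text in scenes; here it is used to locate plates. Unlike text in scanned documents, text in natural scenes has low contrast, varied backgrounds, and a lot of lighting interference, so a very robust extraction method is needed. The most widely used method in the industry today is MSER (maximally stable extremal regions). EasyPR uses an improved variant of MSER optimized specifically for text. Once characters have been located, a classifier such as an ANN model is generally needed to discard most of the false detections. To obtain the final plate, the remaining characters then have to be grouped. Real-world conditions are complex and ordinary clustering often performs poorly, so EasyPR groups them with a more robust seed-growing method.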

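Here is a brief look at the implementation. For details of the method, see the code, which is heavily commented (and rather long). For the ideas behind the method, see the two papers listed in the references.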

//! use verify size to first generate char candidates
void mserCharMatch(const Mat &src, std::vector<Mat> &match, std::vector<CPlate>& out_plateVec_blue, std::vector<CPlate>& out_plateVec_yellow,
  bool usePlateMser, std::vector<RotatedRect>& out_plateRRect_blue, std::vector<RotatedRect>& out_plateRRect_yellow, int img_index,
  bool showDebug) {
  Mat image = src;

  std::vector<std::vector<std::vector<Point>>> all_contours;
  std::vector<std::vector<Rect>> all_boxes;
  all_contours.resize(2);
  all_contours.at(0).reserve(1024);
  all_contours.at(1).reserve(1024);
  all_boxes.resize(2);
  all_boxes.at(0).reserve(1024);
  all_boxes.at(1).reserve(1024);

  match.resize(2);

  std::vector<Color> flags;
  flags.push_back(BLUE);
  flags.push_back(YELLOW);

  const int imageArea = image.rows * image.cols;
  const int delta = 1;
  //const int delta = CParams::instance()->getParam2i();;
  const int minArea = 30;
  const double maxAreaRatio = 0.05;

  Ptr<MSER2> mser;
  mser = MSER2::create(delta, minArea, int(maxAreaRatio * imageArea));
  mser->detectRegions(image, all_contours.at(0), all_boxes.at(0), all_contours.at(1), all_boxes.at(1));

  // mser detect
  // color_index = 0 : mser-, detect white characters, which is in blue plate.
  // color_index = 1 : mser+, detect dark characters, which is in yellow plate.

#pragma omp parallel for
  for (int color_index = 0; color_index < 2; color_index++) {
    Color the_color = flags.at(color_index);

    std::vector<CCharacter> charVec;
    charVec.reserve(128);

    match.at(color_index) = Mat::zeros(image.rows, image.cols, image.type());

    Mat result = image.clone();
    cvtColor(result, result, COLOR_GRAY2BGR);

    size_t size = all_contours.at(color_index).size();

    int char_index = 0;
    int char_size = 20;

    // Chinese plate has max 7 characters.
    const int char_max_count = 7;

    // verify char size and output to rects;
    for (size_t index = 0; index < size; index++) {
      Rect rect = all_boxes.at(color_index)[index];
      std::vector<Point>& contour = all_contours.at(color_index)[index];

      // sometimes a plate could be a mser rect, so we could
      // also use mser algorithm to find plate
      if (usePlateMser) {
        RotatedRect rrect = minAreaRect(Mat(contour));
        if (verifyRotatedPlateSizes(rrect)) {
          //rotatedRectangle(result, rrect, Scalar(255, 0, 0), 2);
          if (the_color == BLUE) out_plateRRect_blue.push_back(rrect);
          if (the_color == YELLOW) out_plateRRect_yellow.push_back(rrect);
        }
      }

      // find character
      if (verifyCharSizes(rect)) {
        Mat mserMat = adaptive_image_from_points(contour, rect, Size(char_size, char_size));
        Mat charInput = preprocessChar(mserMat, char_size);
        Rect charRect = rect;

        Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);
        Mat tmpMat;
        double ostu_level = cv::threshold(image(charRect), tmpMat, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

        //cv::circle(result, center, 3, Scalar(0, 0, 255), 2);

        // use judegMDOratio2 function to
        // remove the small lines in character like "zh-cuan"
        if (judegMDOratio2(image, rect, contour, result)) {
          CCharacter charCandidate;
          charCandidate.setCharacterPos(charRect);
          charCandidate.setCharacterMat(charInput);
          charCandidate.setOstuLevel(ostu_level);
          charCandidate.setCenterPoint(center);
          charCandidate.setIsChinese(false);
          charVec.push_back(charCandidate);
        }
      }
    }

    // important: use matrix multiplication to accelerate the
    // classification of many samples. with the character
    // score, we can use non-maximum suppression (nms) to
    // remove the candidates which are not likely to be true
    // characters, and use the score to select the strong seeds,
    // whose score is larger than 0.9
    CharsIdentify::instance()->classify(charVec);

    // use nms to remove the characters that are not likely to be true.
    double overlapThresh = 0.6;
    //double overlapThresh = CParams::instance()->getParam1f();
    NMStoCharacter(charVec, overlapThresh);
    charVec.shrink_to_fit();

    std::vector<CCharacter> strongSeedVec;
    strongSeedVec.reserve(64);
    std::vector<CCharacter> weakSeedVec;
    weakSeedVec.reserve(64);
    std::vector<CCharacter> littleSeedVec;
    littleSeedVec.reserve(64);

    //size_t charCan_size = charVec.size();
    for (auto charCandidate : charVec) {
      //CCharacter& charCandidate = charVec[char_index];
      Rect rect = charCandidate.getCharacterPos();
      double score = charCandidate.getCharacterScore();
      if (charCandidate.getIsStrong()) {
        strongSeedVec.push_back(charCandidate);
      }
      else if (charCandidate.getIsWeak()) {
        weakSeedVec.push_back(charCandidate);
        //cv::rectangle(result, rect, Scalar(255, 0, 255));
      }
      else if (charCandidate.getIsLittle()) {
        littleSeedVec.push_back(charCandidate);
        //cv::rectangle(result, rect, Scalar(255, 0, 255));
      }
    }

    std::vector<CCharacter> searchCandidate = charVec;

    // nms on the strong seeds, only keep the strongest one
    overlapThresh = 0.3;
    NMStoCharacter(strongSeedVec, overlapThresh);

    // merge chars to group
    std::vector<std::vector<CCharacter>> charGroupVec;
    charGroupVec.reserve(64);
    mergeCharToGroup(strongSeedVec, charGroupVec);

    // generate the line of the group.
    // based on the assumption that the mser rects which are
    // given a high score by the character classifier are without doubt
    // the characters in one plate, we can use these characters
    // to fit a line which is the middle line of the plate.
    std::vector<CPlate> plateVec;
    plateVec.reserve(16);
    for (auto charGroup : charGroupVec) {
      Rect plateResult = charGroup[0].getCharacterPos();
      std::vector<Point> points;
      points.reserve(32);

      Vec4f line;
      int maxarea = 0;
      Rect maxrect;
      double ostu_level_sum = 0;

      int leftx = image.cols;
      Point leftPoint(leftx, 0);
      int rightx = 0;
      Point rightPoint(rightx, 0);

      std::vector<CCharacter> mserCharVec;
      mserCharVec.reserve(32);

      // remove outlier CharGroup
      std::vector<CCharacter> roCharGroup;
      roCharGroup.reserve(32);

      removeRightOutliers(charGroup, roCharGroup, 0.2, 0.5, result);
      //roCharGroup = charGroup;

      for (auto character : roCharGroup) {
        Rect charRect = character.getCharacterPos();
        cv::rectangle(result, charRect, Scalar(0, 255, 0), 1);
        plateResult |= charRect;

        Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);
        points.push_back(center);
        mserCharVec.push_back(character);
        //cv::circle(result, center, 3, Scalar(0, 255, 0), 2);

        ostu_level_sum += character.getOstuLevel();

        if (charRect.area() > maxarea) {
          maxrect = charRect;
          maxarea = charRect.area();
        }
        if (center.x < leftPoint.x) {
          leftPoint = center;
        }
        if (center.x > rightPoint.x) {
          rightPoint = center;
        }
      }

      double ostu_level_avg = ostu_level_sum / (double)roCharGroup.size();
      if (1 && showDebug) {
        std::cout << "ostu_level_avg:" << ostu_level_avg << std::endl;
      }
      float ratio_maxrect = (float)maxrect.width / (float)maxrect.height;

      if (points.size() >= 2 && ratio_maxrect >= 0.3) {
        fitLine(Mat(points), line, CV_DIST_L2, 0, 0.01, 0.01);

        float k = line[1] / line[0];
        //float angle = atan(k) * 180 / (float)CV_PI;
        //std::cout << "k:" << k << std::endl;
        //std::cout << "angle:" << angle << std::endl;
        //std::cout << "cos:" << 0.3 * cos(k) << std::endl;
        //std::cout << "ratio_maxrect:" << ratio_maxrect << std::endl;

        std::sort(mserCharVec.begin(), mserCharVec.end(),
          [](const CCharacter& r1, const CCharacter& r2) {
          return r1.getCharacterPos().tl().x < r2.getCharacterPos().tl().x;
        });

        CCharacter midChar = mserCharVec.at(int(mserCharVec.size() / 2.f));
        Rect midRect = midChar.getCharacterPos();
        Point midCenter(midRect.tl().x + midRect.width / 2, midRect.tl().y + midRect.height / 2);

        int mindist = 7 * maxrect.width;
        std::vector<Vec2i> distVecVec;
        distVecVec.reserve(32);

        Vec2i mindistVec;
        Vec2i avgdistVec;

        // compute the dist, which is the distance between
        // two neighboring characters in the plate. with dist we can
        // judge how to compute the max search range, and choose the
        // best location of the sliding window in the next steps.
        for (size_t mser_i = 0; mser_i + 1 < mserCharVec.size(); mser_i++) {
          Rect charRect = mserCharVec.at(mser_i).getCharacterPos();
          Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);

          Rect charRectCompare = mserCharVec.at(mser_i + 1).getCharacterPos();
          Point centerCompare(charRectCompare.tl().x + charRectCompare.width / 2,
            charRectCompare.tl().y + charRectCompare.height / 2);

          int dist = charRectCompare.x - charRect.x;
          Vec2i distVec(charRectCompare.x - charRect.x, charRectCompare.y - charRect.y);
          distVecVec.push_back(distVec);

          //if (dist < mindist) {
          //  mindist = dist;
          //  mindistVec = distVec;
          //}
        }

        std::sort(distVecVec.begin(), distVecVec.end(),
          [](const Vec2i& r1, const Vec2i& r2) {
          return r1[0] < r2[0];
        });

        avgdistVec = distVecVec.at(int((distVecVec.size() - 1) / 2.f));

        //float step = 10.f * (float)maxrect.width;
        //float step = (float)mindistVec[0];
        float step = (float)avgdistVec[0];

        //cv::line(result, Point2f(line[2] - step, line[3] - k*step), Point2f(line[2] + step, k*step + line[3]), Scalar(255, 255, 255));
        cv::line(result, Point2f(midCenter.x - step, midCenter.y - k*step), Point2f(midCenter.x + step, k*step + midCenter.y), Scalar(255, 255, 255));
        //cv::circle(result, leftPoint, 3, Scalar(0, 0, 255), 2);

        CPlate plate;
        plate.setPlateLeftPoint(leftPoint);
        plate.setPlateRightPoint(rightPoint);

        plate.setPlateLine(line);
        plate.setPlatDistVec(avgdistVec);
        plate.setOstuLevel(ostu_level_avg);

        plate.setPlateMergeCharRect(plateResult);
        plate.setPlateMaxCharRect(maxrect);
        plate.setMserCharacter(mserCharVec);
        plateVec.push_back(plate);
      }
    }

    // use strong seed to construct the first shape of the plate,
    // then we need to find characters which are the weak seed.
    // because we use strong seed to build the middle lines of the plate,
    // we can simply use this to consider weak seeds only lie in the
    // near place of the middle line
    for (auto plate : plateVec) {
      Vec4f line = plate.getPlateLine();
      Point leftPoint = plate.getPlateLeftPoint();
      Point rightPoint = plate.getPlateRightPoint();

      Rect plateResult = plate.getPlateMergeCharRect();
      Rect maxrect = plate.getPlateMaxCharRect();
      Vec2i dist = plate.getPlateDistVec();
      double ostu_level = plate.getOstuLevel();

      std::vector<CCharacter> mserCharacter = plate.getCopyOfMserCharacters();
      mserCharacter.reserve(16);

      float k = line[1] / line[0];
      float x_1 = line[2];
      float y_1 = line[3];

      std::vector<CCharacter> searchWeakSeedVec;
      searchWeakSeedVec.reserve(16);

      std::vector<CCharacter> searchRightWeakSeed;
      searchRightWeakSeed.reserve(8);
      std::vector<CCharacter> searchLeftWeakSeed;
      searchLeftWeakSeed.reserve(8);

      std::vector<CCharacter> slideRightWindow;
      slideRightWindow.reserve(8);
      std::vector<CCharacter> slideLeftWindow;
      slideLeftWindow.reserve(8);

      // draw weak seed and little seed from line;
      // search for mser rect
      if (1 && showDebug) {
        std::cout << "search for mser rect:" << std::endl;
      }

      if (0 && showDebug) {
        std::stringstream ss(std::stringstream::in | std::stringstream::out);
        ss << "resources/image/tmp/" << img_index << "_1_" << "searcgMserRect.jpg";
        imwrite(ss.str(), result);
      }
      if (1 && showDebug) {
        std::cout << "mserCharacter:" << mserCharacter.size() << std::endl;
      }

      // if the count of strong seeds is larger than max count, we don't need
      // all the next steps. if not, we first need to search for weak seeds on
      // the same line as the strong seeds. the judge condition contains the distance
      // between strong seed and weak seed, and the rect similarity of each other, to improve
      // the robustness of the seed growing algorithm.
      if (mserCharacter.size() < char_max_count) {
        double thresh1 = 0.15;
        double thresh2 = 2.0;
        searchWeakSeed(searchCandidate, searchRightWeakSeed, thresh1, thresh2, line, rightPoint,
          maxrect, plateResult, result, CharSearchDirection::RIGHT);
        if (1 && showDebug) {
          std::cout << "searchRightWeakSeed:" << searchRightWeakSeed.size() << std::endl;
        }
        for (auto seed : searchRightWeakSeed) {
          cv::rectangle(result, seed.getCharacterPos(), Scalar(255, 0, 0), 1);
          mserCharacter.push_back(seed);
        }

        searchWeakSeed(searchCandidate, searchLeftWeakSeed, thresh1, thresh2, line, leftPoint,
          maxrect, plateResult, result, CharSearchDirection::LEFT);
        if (1 && showDebug) {
          std::cout << "searchLeftWeakSeed:" << searchLeftWeakSeed.size() << std::endl;
        }
        for (auto seed : searchLeftWeakSeed) {
          cv::rectangle(result, seed.getCharacterPos(), Scalar(255, 0, 0), 1);
          mserCharacter.push_back(seed);
        }
      }

      // sometimes the weak seed is in the middle of the strong seed.
      // and sometimes two strong seed are actually the two parts of one character.
      // because we only consider the weak seed in the left and right direction of strong seed.
      // now we examine all the strong seed and weak seed. not only to find the seed in the middle,
      // but also to combine two seed which are parts of one character to one seed.
      // only by this process, we could use the seed count as the condition to judge if or not to use slide window.
      float min_thresh = 0.3f;
      float max_thresh = 2.5f;
      reFoundAndCombineRect(mserCharacter, min_thresh, max_thresh, dist, maxrect, result);

      // if the characters count is less than max count
      // this means the mser rect in the lines are not enough.
      // sometimes there are still some characters could not be captured by mser algorithm,
      // such as blur, low light ,and some chinese characters like zh-cuan.
      // to handle this ,we use a simple slide window method to find them.
      if (mserCharacter.size() < char_max_count) {
        if (1 && showDebug) {
          std::cout << "search chinese:" << std::endl;
          std::cout << "judege the left is chinese:" << std::endl;
        }

        // if the left most character is chinese, this means
        // that must be the first character in chinese plate,
        // and we need not to do a slide window to left. So,
        // the first thing is to judge whether the left character is
        // chinese or not.
        bool leftIsChinese = false;
        if (1) {
          std::sort(mserCharacter.begin(), mserCharacter.end(),
            [](const CCharacter& r1, const CCharacter& r2) {
            return r1.getCharacterPos().tl().x < r2.getCharacterPos().tl().x;
          });

          CCharacter leftChar = mserCharacter[0];

          //Rect theRect = adaptive_charrect_from_rect(leftChar.getCharacterPos(), image.cols, image.rows);
          Rect theRect = leftChar.getCharacterPos();
          //cv::rectangle(result, theRect, Scalar(255, 0, 0), 1);

          Mat region = image(theRect);
          Mat binary_region;

          ostu_level = cv::threshold(region, binary_region, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
          if (1 && showDebug) {
            std::cout << "left : ostu_level:" << ostu_level << std::endl;
          }
          //plate.setOstuLevel(ostu_level);

          Mat charInput = preprocessChar(binary_region, char_size);
          if (0 /*&& showDebug*/) {
            imshow("charInput", charInput);
            waitKey(0);
            destroyWindow("charInput");
          }

          std::string label = "";
          float maxVal = -2.f;
          leftIsChinese = CharsIdentify::instance()->isCharacter(charInput, label, maxVal, true);
          //auto character = CharsIdentify::instance()->identifyChinese(charInput, maxVal, leftIsChinese);
          //label = character.second;
          if (0 /* && showDebug*/) {
            std::cout << "isChinese:" << leftIsChinese << std::endl;
            std::cout << "chinese:" << label;
            std::cout << "__score:" << maxVal << std::endl;
          }
        }

        // if the left most character is not a chinese,
        // this means we need to slide a window to find the missed mser rect.
        // search for sliding window
        float ratioWindow  = 0.4f;
        //float ratioWindow = CParams::instance()->getParam3f();
        float threshIsCharacter = 0.8f;
        //float threshIsCharacter = CParams::instance()->getParam3f();
        if (!leftIsChinese) {
          slideWindowSearch(image, slideLeftWindow, line, leftPoint, dist, ostu_level, ratioWindow, threshIsCharacter,
            maxrect, plateResult, CharSearchDirection::LEFT, true, result);
          if (1 && showDebug) {
            std::cout << "slideLeftWindow:" << slideLeftWindow.size() << std::endl;
          }
          for (auto window : slideLeftWindow) {
            cv::rectangle(result, window.getCharacterPos(), Scalar(0, 0, 255), 1);
            mserCharacter.push_back(window);
          }
        }
      }

      // if we still have less than max count characters,
      // we need to slide a window to right to search for the missed mser rect.
      if (mserCharacter.size() < char_max_count) {
        // change ostu_level
        float ratioWindow  = 0.4f;
        //float ratioWindow = CParams::instance()->getParam3f();
        float threshIsCharacter = 0.8f;
        //float threshIsCharacter = CParams::instance()->getParam3f();
        slideWindowSearch(image, slideRightWindow, line, rightPoint, dist, plate.getOstuLevel(), ratioWindow, threshIsCharacter,
          maxrect, plateResult, CharSearchDirection::RIGHT, false, result);
        if (1 && showDebug) {
          std::cout << "slideRightWindow:" << slideRightWindow.size() << std::endl;
        }
        for (auto window : slideRightWindow) {
          cv::rectangle(result, window.getCharacterPos(), Scalar(0, 0, 255), 1);
          mserCharacter.push_back(window);
        }
      }

      // compute the plate angle
      float angle = atan(k) * 180 / (float)CV_PI;
      if (1 && showDebug) {
        std::cout << "k:" << k << std::endl;
        std::cout << "angle:" << angle << std::endl;
      }

      // the plateResult rect needs to be enlarged to contain the whole plate,
      // not only the character area.
      float widthEnlargeRatio = 1.15f;
      float heightEnlargeRatio = 1.25f;
      RotatedRect platePos(Point2f((float)plateResult.x + plateResult.width / 2.f, (float)plateResult.y + plateResult.height / 2.f),
        Size2f(plateResult.width * widthEnlargeRatio, maxrect.height * heightEnlargeRatio), angle);

      // verify that the size is likely to be a plate size.
      if (verifyRotatedPlateSizes(platePos)) {
        rotatedRectangle(result, platePos, Scalar(0, 0, 255), 1);

        plate.setPlatePos(platePos);
        plate.setPlateColor(the_color);
        plate.setPlateLocateType(CMSER);

        if (the_color == BLUE) out_plateVec_blue.push_back(plate);
        if (the_color == YELLOW) out_plateVec_yellow.push_back(plate);
      }

      // use deskew to rotate the image, so we need the binary image.
      if (1) {
        for (auto mserChar : mserCharacter) {
          Rect rect = mserChar.getCharacterPos();
          match.at(color_index)(rect) = 255;
        }
        cv::line(match.at(color_index), rightPoint, leftPoint, Scalar(255));
      }
    }

    if (0 /*&& showDebug*/) {
      imshow("result", result);
      waitKey(0);
      destroyWindow("result");
    }

    if (0) {
      imshow("match", match.at(color_index));
      waitKey(0);
      destroyWindow("match");
    }

    if (0) {
      std::stringstream ss(std::stringstream::in | std::stringstream::out);
      ss << "resources/image/tmp/plateDetect/plate_" << img_index << "_" << the_color << ".jpg";
      imwrite(ss.str(), result);
    }
  }

}


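First, regions are extracted with MSER and checked against a size criterion to filter out those that clearly do not match the size of plate characters. Next, a character classifier is applied, and regions whose classification probability is greater than 0.9 become strong seeds (the green boxes in the figure below). Nearby strong seeds are grouped, and a line is drawn through their centers (the white line in the figure). In general this line is the central axis of the plate, sharing its slope and so on. Weak seeds, those with probability below 0.9, are then searched for near this line (blue boxes). Given how plates look, these blue boxes should be neither far from the green boxes nor very different in size. Blue boxes are searched for to the left and right of the green boxes. Sometimes another box lies between several green boxes; it can be inferred from the spacing between boxes, and these are the orange boxes. Once the search is complete, the total number of green, blue, and orange boxes is the number of characters found so far in the plate region. This number is sometimes below 7 (the number of characters on a Chinese plate), because some regions cannot be extracted even by MSER (for example, very unstable regions or regions with strong lighting variation), and many Chinese characters cannot be extracted by MSER either (most Chinese characters are not connected, whereas MSER regions are essentially connected). A sliding window (red boxes) is therefore added to look for the missing characters or Chinese characters; if the classifier probability exceeds a threshold, the window is added to the final result. Finally, a box drawn around the positions of all characters gives the plate region.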
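The strong-seed and weak-seed steps described above can be sketched in a few lines of standard C++. This is only an illustration under stated assumptions, not EasyPR's implementation: `Candidate`, `fitCenterLine`, `splitSeeds`, and `growWeakSeeds` are hypothetical names, the line is fitted with ordinary least squares instead of `cv::fitLine`, and the two tolerance parameters are made up for the example.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical stand-in for a character candidate: bounding box plus classifier score.
struct Candidate {
  double x, y, w, h;  // top-left corner and size of the box
  double score;       // classifier probability that the box is a character
  double cx() const { return x + w / 2; }
  double cy() const { return y + h / 2; }
};

// Line y = k * x + b through the box centers.
struct Line { double k, b; };

// Ordinary least-squares fit through the seed centers
// (a simplification; the original code calls cv::fitLine).
Line fitCenterLine(const std::vector<Candidate>& seeds) {
  assert(seeds.size() >= 2);
  double sx = 0, sy = 0, sxx = 0, sxy = 0;
  for (const auto& s : seeds) {
    sx += s.cx();
    sy += s.cy();
    sxx += s.cx() * s.cx();
    sxy += s.cx() * s.cy();
  }
  const double n = static_cast<double>(seeds.size());
  const double denom = n * sxx - sx * sx;  // zero only if all centers share one x
  assert(denom != 0);
  const double k = (n * sxy - sx * sy) / denom;
  return {k, (sy - k * sx) / n};
}

// Candidates scoring above strongThresh become strong seeds, the rest weak seeds.
void splitSeeds(const std::vector<Candidate>& all, double strongThresh,
                std::vector<Candidate>& strong, std::vector<Candidate>& weak) {
  for (const auto& c : all) (c.score > strongThresh ? strong : weak).push_back(c);
}

// Keep weak seeds whose center is vertically close to the line and whose
// height is similar to refHeight. Both tolerances are illustrative values.
std::vector<Candidate> growWeakSeeds(const std::vector<Candidate>& weak, const Line& line,
                                     double refHeight, double maxOffsetRatio,
                                     double maxHeightDiffRatio) {
  std::vector<Candidate> kept;
  for (const auto& c : weak) {
    const double offset = std::abs(c.cy() - (line.k * c.cx() + line.b));
    const double heightDiff = std::abs(c.h - refHeight) / refHeight;
    if (offset <= maxOffsetRatio * refHeight && heightDiff <= maxHeightDiffRatio) {
      kept.push_back(c);
    }
  }
  return kept;
}
```

A caller would run `splitSeeds` with a threshold of 0.9, fit the line through the strong seeds, and then pass the weak seeds to `growWeakSeeds`. The sliding-window step for missing characters is not covered by this sketch.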

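To debug the program using the intermediate images, first follow the call chain plateMserLocate->mserSearch->mserCharMatch to find the location in core_func.cpp. At the end of the function, change the condition guarding the image output to 1. Then, under resources/image, create the directory tmp and, inside it, plateDetect (matching the path in the code). On the next run, the debug images appear in the new directory. (EasyPR has many other similar output snippets; creating the folders as the code expects is enough to see their output.)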

Figure 5: intermediate result of text-based localization (debug image).

2. A More Reasonable and Accurate Evaluation Metric

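The original EasyPR evaluation criteria were unreasonable in several ways. For example, finding a single candidate region in an image counted as a successful localization. And if several plates were located in one image, the one with the smallest error rate was taken as the localization result. The problem is that the candidate region found may not be a plate region at all. Likewise, taking only one plate as the result for an image that contains several is clearly unreasonable.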

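The new metric therefore has to account for the positional difference between the located region and the actual plate region: localization counts as successful only when the two are close. In addition, an image with several plates has several corresponding located regions; each region is compared against the plates, and only the combined result measures localization quality. This requires ground truth that marks the position of every plate. In the new version we annotated 251 images, containing the positions of 250 plates in total. To measure how much a located region differs from the plate region, we also adopted the ICDAR 2003 evaluation protocol to compute the final recall, precision (printed as "Precise" in the output), and F-score of localization.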
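As a rough illustration of this kind of protocol, the sketch below follows the ICDAR 2003 idea as I understand it: the match between two rectangles is the intersection area divided by the area of the smallest box containing both, each rectangle takes its best match in the other set, and recall and precision are averages of those best matches. The names `Box`, `matchPair`, `bestMatch`, and `evaluate` are hypothetical, the boxes are axis-aligned for simplicity, and whether EasyPR computes the scores exactly this way is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Box { double x, y, w, h; };  // axis-aligned rectangle (simplification)

// Match between two boxes: intersection area divided by the area of the
// smallest axis-aligned box containing both. 1 means identical, 0 means disjoint.
double matchPair(const Box& a, const Box& b) {
  assert(a.w >= 0 && a.h >= 0 && b.w >= 0 && b.h >= 0);
  const double ix = std::max(0.0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
  const double iy = std::max(0.0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
  const double bw = std::max(a.x + a.w, b.x + b.w) - std::min(a.x, b.x);
  const double bh = std::max(a.y + a.h, b.y + b.h) - std::min(a.y, b.y);
  const double bound = bw * bh;
  return bound > 0 ? (ix * iy) / bound : 0.0;
}

// Best match of one box against a whole set (0 if the set is empty).
double bestMatch(const Box& r, const std::vector<Box>& set) {
  double best = 0.0;
  for (const auto& s : set) best = std::max(best, matchPair(r, s));
  return best;
}

struct Scores { double recall, precision, fscore; };

// Recall averages the best matches of the ground-truth boxes,
// precision averages the best matches of the detected boxes.
Scores evaluate(const std::vector<Box>& truth, const std::vector<Box>& detected) {
  double r = 0.0, p = 0.0;
  for (const auto& t : truth) r += bestMatch(t, detected);
  for (const auto& d : detected) p += bestMatch(d, truth);
  const double recall = truth.empty() ? 0.0 : r / static_cast<double>(truth.size());
  const double precision = detected.empty() ? 0.0 : p / static_cast<double>(detected.size());
  const double f = (recall + precision) > 0 ? 2 * precision * recall / (precision + recall) : 0.0;
  return {recall, precision, f};
}
```

With one ground-truth plate, one exact detection, and one false detection elsewhere, this gives a recall of 1 and a precision of 0.5, which shows how spurious regions are penalized even when every plate is found.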

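Plate localization evaluation changed substantially; the character recognition evaluation changed only slightly. First, the "average character difference" metric, which carried little meaning, was removed. It is replaced by three ratios: zero-character difference, one-character difference, and Chinese character accuracy. Zero-character difference (0-error) means the recognition result is identical to the plate, the same as the "fully correct rate" in the original protocol. One-character difference (1-error) means the result differs by at most one character, which includes the 0-error case. Note that a Chinese character usually counts as two characters. Chinese character accuracy (Chinese-precise) is the ratio of Chinese characters recognized correctly. For all three metrics, higher is better, with a maximum of 100%.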
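One way to compute the 0-error and 1-error ratios is with an edit distance between the recognized string and the ground-truth string. The sketch below uses the classic Levenshtein distance; `editDistance` and `errorRates` are hypothetical names, and it is an assumption that this matches how EasyPR counts character differences. The comparison is byte-wise, so a multi-byte Chinese character contributes more than one unit, which is in the spirit of the note that Chinese usually counts as two characters.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Levenshtein distance: insertions, deletions, and substitutions each cost 1.
// The comparison is byte-wise, so a multi-byte character counts more than once.
std::size_t editDistance(const std::string& a, const std::string& b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      const std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({subst, prev[j] + 1, cur[j - 1] + 1});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

struct ErrorRates { double zeroError, oneError; };

// zeroError: share of plates recognized exactly.
// oneError: share of plates with at most one differing character (includes exact ones).
ErrorRates errorRates(const std::vector<std::string>& truth,
                      const std::vector<std::string>& recognized) {
  assert(truth.size() == recognized.size() && !truth.empty());
  std::size_t zero = 0, one = 0;
  for (std::size_t i = 0; i < truth.size(); ++i) {
    const std::size_t d = editDistance(truth[i], recognized[i]);
    if (d == 0) ++zero;
    if (d <= 1) ++one;
  }
  const double n = static_cast<double>(truth.size());
  return {static_cast<double>(zero) / n, static_cast<double>(one) / n};
}
```

Because every 0-error plate is also a 1-error plate, the 1-error ratio can never be lower than the 0-error ratio, which is consistent with the results reported in this section.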

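To see these metrics in practice, we ran a comparison on the 50 complex images newly added to the general test set. The results below show how text-based localization performs on this data compared with the original SOBEL and COLOR localization methods.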

SOBEL+COLOR:
Total images: 50, Plates count: 52, Localization rate: 51.9231%
Recall: 46.1696%, Precise: 26.3273%, Fscore: 33.533%
0-error: 12.5%, 1-error: 12.5%, Chinese-precise: 37.5%

CMSER:
Total images: 50, Plates count: 52, Localization rate: 78.8462%
Recall: 70.6192%, Precise: 70.1825%, Fscore: 70.4002%
0-error: 59.4595%, 1-error: 70.2703%, Chinese-precise: 70.2703%

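The localization rate improves by nearly 27 percentage points, and both the localization F-score and the Chinese recognition accuracy roughly double.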
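As a quick consistency check on the reported numbers, and assuming the F-score is the usual harmonic mean of precision and recall (the output itself does not state the formula), the reported values can be reproduced from the recall and precision figures:

```latex
F = \frac{2PR}{P + R}

F_{\text{SOBEL+COLOR}} = \frac{2 \times 26.3273 \times 46.1696}{26.3273 + 46.1696} \approx 33.53

F_{\text{CMSER}} = \frac{2 \times 70.1825 \times 70.6192}{70.1825 + 70.6192} \approx 70.40

78.8462 - 51.9231 = 26.9231 \ \text{percentage points (localization rate)}
```

Both computed F-scores agree with the printed Fscore values, and the localization rates correspond to 27 and 41 plates out of 52.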

3. Non-Maximum Suppression

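Another major change in the new version is the extensive use of non-maximum suppression (NMS). It has several benefits. 1. When several located regions overlap, their confidence (the value produced by the SVM plate-judgment model) is used to keep the most probable one and remove the others. So when different localization methods, for example Sobel and Color, locate the same region, only one is kept. As a result, a finally located plate region in the new version no longer has several boxes around it. 2. Combined with a sliding window, NMS can pinpoint character positions, for example finding the most probable character position in the plate localization module, or locating the Chinese character more precisely in the character recognition module.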

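NMS also decouples EasyPR's localization methods from the recognition module that follows. Previously, each added localization method could affect the final output. Now, however many localization methods locate a plate, NMS keeps only the most probable one, with no effect on later stages.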
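A minimal greedy NMS can be sketched as follows. It is an illustration only: `Detection`, `iou`, and `nonMaxSuppression` are hypothetical names, overlap is measured as intersection over union on axis-aligned boxes, and a higher score is assumed to mean a more likely detection. EasyPR's own overlap measure and score convention may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Detection {
  double x, y, w, h;  // axis-aligned box
  double score;       // assumption: higher means more likely to be correct
};

// Intersection over union of two boxes, in [0, 1].
double iou(const Detection& a, const Detection& b) {
  const double ix = std::max(0.0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
  const double iy = std::max(0.0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
  const double inter = ix * iy;
  const double uni = a.w * a.h + b.w * b.h - inter;
  return uni > 0 ? inter / uni : 0.0;
}

// Greedy NMS: visit detections from highest to lowest score and keep one
// only if it does not overlap an already kept detection by more than overlapThresh.
std::vector<Detection> nonMaxSuppression(std::vector<Detection> dets, double overlapThresh) {
  assert(overlapThresh >= 0.0 && overlapThresh <= 1.0);
  std::sort(dets.begin(), dets.end(),
            [](const Detection& a, const Detection& b) { return a.score > b.score; });
  std::vector<Detection> kept;
  for (const auto& d : dets) {
    bool suppressed = false;
    for (const auto& k : kept) {
      if (iou(d, k) > overlapThresh) {
        suppressed = true;
        break;
      }
    }
    if (!suppressed) kept.push_back(d);
  }
  return kept;
}
```

Because the function only needs boxes and scores, it does not matter which localization method produced a detection, which is the decoupling effect described above.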

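In addition, the setMaxPlates() function now actually works. Previously it could be set but had no effect. Now, with this value set to n, when more than n plate regions are detected in an image (counted after non-maximum suppression), EasyPR outputs only the n most likely plate regions.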

4. Improvements to Character Segmentation and Recognition

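Both character segmentation and recognition gained new algorithms in the new version. For example, spatial-ostu (spatial Otsu thresholding) replaces the ordinary Otsu algorithm, which improves binarization during segmentation on unevenly lit images.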
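The idea of a spatial Otsu threshold can be sketched as: split the image into a grid and compute a separate Otsu threshold for each cell, so that a dark part of the plate is not binarized with a threshold dominated by a bright part. The code below is a self-contained illustration under that assumption; `GrayImage`, `otsuThreshold`, and `spatialOtsu` are hypothetical names and not EasyPR's functions.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Minimal row-major 8-bit grayscale image (hypothetical container).
struct GrayImage {
  int width, height;
  std::vector<std::uint8_t> pixels;
  std::uint8_t& at(int x, int y) { return pixels[static_cast<std::size_t>(y * width + x)]; }
  std::uint8_t at(int x, int y) const { return pixels[static_cast<std::size_t>(y * width + x)]; }
};

// Otsu threshold for the cell [x0, x1) x [y0, y1): the gray level that
// maximizes the between-class variance. Pixels above it count as foreground.
int otsuThreshold(const GrayImage& img, int x0, int y0, int x1, int y1) {
  std::array<long long, 256> hist{};
  for (int y = y0; y < y1; ++y)
    for (int x = x0; x < x1; ++x) ++hist[img.at(x, y)];
  const long long total = static_cast<long long>(x1 - x0) * (y1 - y0);
  long long sumAll = 0;
  for (int t = 0; t < 256; ++t) sumAll += t * hist[t];

  long long countBg = 0, sumBg = 0;
  double bestVar = -1.0;
  int best = 0;
  for (int t = 0; t < 256; ++t) {
    countBg += hist[t];
    sumBg += t * hist[t];
    if (countBg == 0) continue;
    const long long countFg = total - countBg;
    if (countFg == 0) break;
    const double meanBg = static_cast<double>(sumBg) / static_cast<double>(countBg);
    const double meanFg = static_cast<double>(sumAll - sumBg) / static_cast<double>(countFg);
    const double diff = meanBg - meanFg;
    const double var = static_cast<double>(countBg) * static_cast<double>(countFg) * diff * diff;
    if (var > bestVar) {
      bestVar = var;
      best = t;
    }
  }
  return best;
}

// Binarize each grid cell with its own Otsu threshold.
GrayImage spatialOtsu(const GrayImage& src, int gridX, int gridY) {
  assert(gridX > 0 && gridY > 0 && src.width >= gridX && src.height >= gridY);
  GrayImage dst = src;
  for (int gy = 0; gy < gridY; ++gy) {
    for (int gx = 0; gx < gridX; ++gx) {
      const int x0 = src.width * gx / gridX, x1 = src.width * (gx + 1) / gridX;
      const int y0 = src.height * gy / gridY, y1 = src.height * (gy + 1) / gridY;
      const int t = otsuThreshold(src, x0, y0, x1, y1);
      for (int y = y0; y < y1; ++y)
        for (int x = x0; x < x1; ++x)
          dst.at(x, y) = static_cast<std::uint8_t>(src.at(x, y) > t ? 255 : 0);
    }
  }
  return dst;
}
```

Calling `spatialOtsu(img, 1, 1)` reduces to a single global Otsu threshold, which makes it easy to compare the two behaviours on the same image.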

Figure 6: plate image (left), ordinary Otsu threshold result (middle), spatial Otsu threshold result (right).

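Recognition also gained an adaptive threshold method for Chinese characters. It binarizes the character "川" better than Otsu does. Applying both methods and choosing the result with the highest recognition probability significantly improves Chinese character accuracy. When recognizing Chinese, a small sliding window was also added to compensate for imprecise localization when the Chinese character is looked up directly via the province character.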

5. New Features and SVM Model, New ANN Model for Chinese Recognition

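To make plate judgment more robust, the new version changes the features of the SVM model: a model using LBP features judges well even on low-contrast and poorly lit plate images. To improve Chinese recognition accuracy, a separate ANN model, ann_chinese, is now trained for the 31 classes of Chinese characters. When classifying Chinese characters, it improves on the original general-purpose model by nearly 10 percentage points.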

6. Miscellaneous

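EasyPR 1.5-alpha was released a few days ago. Compared with alpha, today's beta adds a Grid Search feature, further tunes some parameters of the text-based localization method, and removes some Chinese comments to improve compatibility on Windows. In terms of speed, this version is the first to use multithreaded programming (OpenMP) to improve the overall efficiency of the algorithm, making it about 2x faster.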
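The OpenMP pattern used in mserCharMatch, one loop iteration per plate color, can be illustrated with a small standalone sketch. `processChannel` and `processAll` are hypothetical names and the workload is a placeholder sum; the point is only that each iteration writes to its own output slot, so the iterations share no mutable state.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Placeholder per-channel workload (stands in for one color's detection pass).
long long processChannel(const std::vector<int>& data) {
  return std::accumulate(data.begin(), data.end(), 0LL);
}

// Process every channel, potentially in parallel. Iteration i writes only
// results[i], so no locking is needed.
std::vector<long long> processAll(const std::vector<std::vector<int>>& channels) {
  assert(!channels.empty());
  std::vector<long long> results(channels.size());
  const int n = static_cast<int>(channels.size());
  // The pragma takes effect only when compiled with OpenMP enabled
  // (for example -fopenmp); otherwise the loop simply runs serially.
#ifdef _OPENMP
#pragma omp parallel for
#endif
  for (int i = 0; i < n; ++i) {
    const std::size_t idx = static_cast<std::size_t>(i);
    results[idx] = processChannel(channels[idx]);
  }
  return results;
}
```

With only two iterations, as in the blue and yellow loop, the best case is roughly a 2x speedup of that loop, and only if both iterations do a similar amount of work.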

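Now some shortcomings of the new version. Text-based localization is indeed very robust, but unfortunately it is still nearly twice as slow as color-based localization (its speed is comparable to Sobel localization). Later improvements will look at optimizing it. Also, character segmentation could still benefit from better algorithms, and a future version may attempt a larger rework there.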

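A note on EasyPR: it is an open-source Chinese license plate recognition system, with code hosted on github and gitosc. Earlier posts on this blog contain the development documentation and introductions for EasyPR to date.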

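Copyright notice:

The copyright of all text, images, and code in this article is held jointly by the author and cnblogs (博客園). Reposting is welcome, provided the author and the source are credited. Any unauthorized plagiarism or crawler scraping is an infringement, and the author and cnblogs reserve all rights.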

References:

　　1.Character-MSER : Scene Text Detection with Robust Character Candidate Extraction Method, ICDAR2015

　　2.Seed-growing : A robust hierarchical detection method for scene text based on convolutional neural networks, ICME2015
