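Today we introduce a new method for license plate localization, the text localization method (MSER), covering its main design ideas and its implementation. After that we go through the changes that come with EasyPR v1.5-beta.
1. Text localization method
The most criticized weakness of earlier EasyPR versions was poor localization, especially on everyday scenes such as photos taken with a phone. EasyPR's earliest data came from traffic checkpoint cameras, so it was tuned for that data and had no good strategy for everyday images. A later version (v1.3) added a color-based localization method, which helped, but high-resolution images were still handled poorly. In addition, color localization degrades sharply on low-light and low-contrast images, and color is an unstable feature in itself. As a result, the overall robustness of EasyPR's plate localization was still insufficient.
To address this, EasyPR v1.5 adds a new localization method, text localization, which greatly improves on these problems. The following figures illustrate its results.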
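Figure 1: a plate image at night (left); Figure 2: an image with very low contrast (right)
Figure 3: a close-range image (left); Figure 4: a high-resolution image (right)
Figure 1 is a plate image taken at night, Figure 2 has very low contrast, Figure 3 was taken at very close range, and Figure 4 is a high-resolution image (3200 pixels wide).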
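The text localization method uses a low-level filter to extract characters and then groups them. It was originally used to locate text in natural scenes; here we use it to locate plates. Unlike text in scanned documents, text in natural scenes has low contrast, varied backgrounds, and a lot of lighting interference, so a very robust extraction method is needed. The one most widely used in the industry today is MSER (Maximally Stable Extremal Regions). EasyPR uses an improved variant of MSER that is tuned specifically for text. Once characters have been located, a classifier such as an ANN model is usually needed to remove most of the false detections. To obtain the final plate, the characters must then be grouped. Because real scenes are complex, plain clustering often works poorly, so EasyPR groups them with a more robust seed growing method.
Here is a brief look at the implementation. For the details, see the code, which is heavily commented (and fairly long). For the ideas behind the method, see the two papers in the references.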
//! use verify size to first generate char candidates
void mserCharMatch(const Mat &src, std::vector<Mat> &match, std::vector<CPlate>& out_plateVec_blue, std::vector<CPlate>& out_plateVec_yellow,
  bool usePlateMser, std::vector<RotatedRect>& out_plateRRect_blue, std::vector<RotatedRect>& out_plateRRect_yellow, int img_index,
  bool showDebug) {
  Mat image = src;

  std::vector<std::vector<std::vector<Point>>> all_contours;
  std::vector<std::vector<Rect>> all_boxes;
  all_contours.resize(2);
  all_contours.at(0).reserve(1024);
  all_contours.at(1).reserve(1024);
  all_boxes.resize(2);
  all_boxes.at(0).reserve(1024);
  all_boxes.at(1).reserve(1024);

  match.resize(2);

  std::vector<Color> flags;
  flags.push_back(BLUE);
  flags.push_back(YELLOW);

  const int imageArea = image.rows * image.cols;
  const int delta = 1;
  //const int delta = CParams::instance()->getParam2i();
  const int minArea = 30;
  const double maxAreaRatio = 0.05;

  Ptr<MSER2> mser;
  mser = MSER2::create(delta, minArea, int(maxAreaRatio * imageArea));
  mser->detectRegions(image, all_contours.at(0), all_boxes.at(0), all_contours.at(1), all_boxes.at(1));

  // mser detect
  // color_index = 0 : mser-, detect white characters, which are on blue plates.
  // color_index = 1 : mser+, detect dark characters, which are on yellow plates.

#pragma omp parallel for
  for (int color_index = 0; color_index < 2; color_index++) {
    Color the_color = flags.at(color_index);

    std::vector<CCharacter> charVec;
    charVec.reserve(128);

    match.at(color_index) = Mat::zeros(image.rows, image.cols, image.type());

    Mat result = image.clone();
    cvtColor(result, result, COLOR_GRAY2BGR);

    size_t size = all_contours.at(color_index).size();

    int char_index = 0;
    int char_size = 20;

    // Chinese plate has max 7 characters.
    const int char_max_count = 7;

    // verify char size and output to rects;
    for (size_t index = 0; index < size; index++) {
      Rect rect = all_boxes.at(color_index)[index];
      std::vector<Point>& contour = all_contours.at(color_index)[index];

      // sometimes a plate could be a mser rect, so we could
      // also use mser algorithm to find plate
      if (usePlateMser) {
        RotatedRect rrect = minAreaRect(Mat(contour));
        if (verifyRotatedPlateSizes(rrect)) {
          //rotatedRectangle(result, rrect, Scalar(255, 0, 0), 2);
          if (the_color == BLUE) out_plateRRect_blue.push_back(rrect);
          if (the_color == YELLOW) out_plateRRect_yellow.push_back(rrect);
        }
      }

      // find character
      if (verifyCharSizes(rect)) {
        Mat mserMat = adaptive_image_from_points(contour, rect, Size(char_size, char_size));
        Mat charInput = preprocessChar(mserMat, char_size);
        Rect charRect = rect;

        Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);
        Mat tmpMat;
        double ostu_level = cv::threshold(image(charRect), tmpMat, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);

        //cv::circle(result, center, 3, Scalar(0, 0, 255), 2);

        // use judegMDOratio2 function to
        // remove the small lines in character like "zh-cuan"
        if (judegMDOratio2(image, rect, contour, result)) {
          CCharacter charCandidate;
          charCandidate.setCharacterPos(charRect);
          charCandidate.setCharacterMat(charInput);
          charCandidate.setOstuLevel(ostu_level);
          charCandidate.setCenterPoint(center);
          charCandidate.setIsChinese(false);
          charVec.push_back(charCandidate);
        }
      }
    }

    // important, use matrix multiplication to accelerate the
    // classification of many samples. use the character
    // score, we can use non-maximum suppression (nms) to
    // reduce the characters which are not likely to be true
    // characters, and use the score to select the strong seed
    // of which the score is larger than 0.9
    CharsIdentify::instance()->classify(charVec);

    // use nms to remove the characters that are not likely to be true.
    double overlapThresh = 0.6;
    //double overlapThresh = CParams::instance()->getParam1f();
    NMStoCharacter(charVec, overlapThresh);
    charVec.shrink_to_fit();

    std::vector<CCharacter> strongSeedVec;
    strongSeedVec.reserve(64);
    std::vector<CCharacter> weakSeedVec;
    weakSeedVec.reserve(64);
    std::vector<CCharacter> littleSeedVec;
    littleSeedVec.reserve(64);

    //size_t charCan_size = charVec.size();
    for (auto charCandidate : charVec) {
      //CCharacter& charCandidate = charVec[char_index];
      Rect rect = charCandidate.getCharacterPos();
      double score = charCandidate.getCharacterScore();
      if (charCandidate.getIsStrong()) {
        strongSeedVec.push_back(charCandidate);
      }
      else if (charCandidate.getIsWeak()) {
        weakSeedVec.push_back(charCandidate);
        //cv::rectangle(result, rect, Scalar(255, 0, 255));
      }
      else if (charCandidate.getIsLittle()) {
        littleSeedVec.push_back(charCandidate);
        //cv::rectangle(result, rect, Scalar(255, 0, 255));
      }
    }

    std::vector<CCharacter> searchCandidate = charVec;

    // nms to strong seed, only leave the strongest one
    overlapThresh = 0.3;
    NMStoCharacter(strongSeedVec, overlapThresh);

    // merge chars to group
    std::vector<std::vector<CCharacter>> charGroupVec;
    charGroupVec.reserve(64);
    mergeCharToGroup(strongSeedVec, charGroupVec);

    // generate the line of the group
    // based on the assumptions, the mser rects which are
    // given a high score by the character classifier are without doubt
    // the characters in one plate, and we can use these characters
    // to fit a line which is the middle line of the plate.
    std::vector<CPlate> plateVec;
    plateVec.reserve(16);
    for (auto charGroup : charGroupVec) {
      Rect plateResult = charGroup[0].getCharacterPos();
      std::vector<Point> points;
      points.reserve(32);

      Vec4f line;
      int maxarea = 0;
      Rect maxrect;
      double ostu_level_sum = 0;

      int leftx = image.cols;
      Point leftPoint(leftx, 0);
      int rightx = 0;
      Point rightPoint(rightx, 0);

      std::vector<CCharacter> mserCharVec;
      mserCharVec.reserve(32);

      // remove outlier CharGroup
      std::vector<CCharacter> roCharGroup;
      roCharGroup.reserve(32);

      removeRightOutliers(charGroup, roCharGroup, 0.2, 0.5, result);
      //roCharGroup = charGroup;

      for (auto character : roCharGroup) {
        Rect charRect = character.getCharacterPos();
        cv::rectangle(result, charRect, Scalar(0, 255, 0), 1);
        plateResult |= charRect;

        Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);
        points.push_back(center);
        mserCharVec.push_back(character);
        //cv::circle(result, center, 3, Scalar(0, 255, 0), 2);

        ostu_level_sum += character.getOstuLevel();

        if (charRect.area() > maxarea) {
          maxrect = charRect;
          maxarea = charRect.area();
        }
        if (center.x < leftPoint.x) {
          leftPoint = center;
        }
        if (center.x > rightPoint.x) {
          rightPoint = center;
        }
      }

      double ostu_level_avg = ostu_level_sum / (double)roCharGroup.size();
      if (1 && showDebug) {
        std::cout << "ostu_level_avg:" << ostu_level_avg << std::endl;
      }
      float ratio_maxrect = (float)maxrect.width / (float)maxrect.height;

      if (points.size() >= 2 && ratio_maxrect >= 0.3) {
        fitLine(Mat(points), line, CV_DIST_L2, 0, 0.01, 0.01);

        float k = line[1] / line[0];
        //float angle = atan(k) * 180 / (float)CV_PI;
        //std::cout << "k:" << k << std::endl;
        //std::cout << "angle:" << angle << std::endl;
        //std::cout << "cos:" << 0.3 * cos(k) << std::endl;
        //std::cout << "ratio_maxrect:" << ratio_maxrect << std::endl;

        std::sort(mserCharVec.begin(), mserCharVec.end(),
          [](const CCharacter& r1, const CCharacter& r2) {
          return r1.getCharacterPos().tl().x < r2.getCharacterPos().tl().x;
        });

        CCharacter midChar = mserCharVec.at(int(mserCharVec.size() / 2.f));
        Rect midRect = midChar.getCharacterPos();
        Point midCenter(midRect.tl().x + midRect.width / 2, midRect.tl().y + midRect.height / 2);

        int mindist = 7 * maxrect.width;
        std::vector<Vec2i> distVecVec;
        distVecVec.reserve(32);

        Vec2i mindistVec;
        Vec2i avgdistVec;

        // compute the dist, which is the distance between
        // two neighboring characters in the plate. with dist we can
        // judge how to compute the max search range, and choose the
        // best location of the sliding window in the next steps.
        for (size_t mser_i = 0; mser_i + 1 < mserCharVec.size(); mser_i++) {
          Rect charRect = mserCharVec.at(mser_i).getCharacterPos();
          Point center(charRect.tl().x + charRect.width / 2, charRect.tl().y + charRect.height / 2);

          Rect charRectCompare = mserCharVec.at(mser_i + 1).getCharacterPos();
          Point centerCompare(charRectCompare.tl().x + charRectCompare.width / 2,
            charRectCompare.tl().y + charRectCompare.height / 2);

          int dist = charRectCompare.x - charRect.x;
          Vec2i distVec(charRectCompare.x - charRect.x, charRectCompare.y - charRect.y);
          distVecVec.push_back(distVec);

          //if (dist < mindist) {
          //  mindist = dist;
          //  mindistVec = distVec;
          //}
        }

        std::sort(distVecVec.begin(), distVecVec.end(),
          [](const Vec2i& r1, const Vec2i& r2) {
          return r1[0] < r2[0];
        });

        avgdistVec = distVecVec.at(int((distVecVec.size() - 1) / 2.f));

        //float step = 10.f * (float)maxrect.width;
        //float step = (float)mindistVec[0];
        float step = (float)avgdistVec[0];

        //cv::line(result, Point2f(line[2] - step, line[3] - k*step), Point2f(line[2] + step, k*step + line[3]), Scalar(255, 255, 255));
        cv::line(result, Point2f(midCenter.x - step, midCenter.y - k*step), Point2f(midCenter.x + step, k*step + midCenter.y), Scalar(255, 255, 255));
        //cv::circle(result, leftPoint, 3, Scalar(0, 0, 255), 2);

        CPlate plate;
        plate.setPlateLeftPoint(leftPoint);
        plate.setPlateRightPoint(rightPoint);

        plate.setPlateLine(line);
        plate.setPlatDistVec(avgdistVec);
        plate.setOstuLevel(ostu_level_avg);

        plate.setPlateMergeCharRect(plateResult);
        plate.setPlateMaxCharRect(maxrect);
        plate.setMserCharacter(mserCharVec);
        plateVec.push_back(plate);
      }
    }

    // use strong seed to construct the first shape of the plate,
    // then we need to find characters which are the weak seed.
    // because we use strong seed to build the middle lines of the plate,
    // we can simply use this to consider weak seeds only lie in the
    // near place of the middle line
    for (auto plate : plateVec) {
      Vec4f line = plate.getPlateLine();
      Point leftPoint = plate.getPlateLeftPoint();
      Point rightPoint = plate.getPlateRightPoint();

      Rect plateResult = plate.getPlateMergeCharRect();
      Rect maxrect = plate.getPlateMaxCharRect();
      Vec2i dist = plate.getPlateDistVec();
      double ostu_level = plate.getOstuLevel();

      std::vector<CCharacter> mserCharacter = plate.getCopyOfMserCharacters();
      mserCharacter.reserve(16);

      float k = line[1] / line[0];
      float x_1 = line[2];
      float y_1 = line[3];

      std::vector<CCharacter> searchWeakSeedVec;
      searchWeakSeedVec.reserve(16);

      std::vector<CCharacter> searchRightWeakSeed;
      searchRightWeakSeed.reserve(8);
      std::vector<CCharacter> searchLeftWeakSeed;
      searchLeftWeakSeed.reserve(8);

      std::vector<CCharacter> slideRightWindow;
      slideRightWindow.reserve(8);
      std::vector<CCharacter> slideLeftWindow;
      slideLeftWindow.reserve(8);

      // draw weak seed and little seed from line;
      // search for mser rect
      if (1 && showDebug) {
        std::cout << "search for mser rect:" << std::endl;
      }

      if (0 && showDebug) {
        std::stringstream ss(std::stringstream::in | std::stringstream::out);
        ss << "resources/image/tmp/" << img_index << "_1_" << "searcgMserRect.jpg";
        imwrite(ss.str(), result);
      }
      if (1 && showDebug) {
        std::cout << "mserCharacter:" << mserCharacter.size() << std::endl;
      }

      // if the count of strong seed is larger than max count, we don't need
      // all the next steps, if not, we first need to search the weak seed in
      // the same line as the strong seed. The judge condition contains the distance
      // between strong seed and weak seed, and the rect similarity of each other to improve
      // the robustness of the seed growing algorithm.
      if (mserCharacter.size() < char_max_count) {
        double thresh1 = 0.15;
        double thresh2 = 2.0;
        searchWeakSeed(searchCandidate, searchRightWeakSeed, thresh1, thresh2, line, rightPoint,
          maxrect, plateResult, result, CharSearchDirection::RIGHT);
        if (1 && showDebug) {
          std::cout << "searchRightWeakSeed:" << searchRightWeakSeed.size() << std::endl;
        }
        for (auto seed : searchRightWeakSeed) {
          cv::rectangle(result, seed.getCharacterPos(), Scalar(255, 0, 0), 1);
          mserCharacter.push_back(seed);
        }

        searchWeakSeed(searchCandidate, searchLeftWeakSeed, thresh1, thresh2, line, leftPoint,
          maxrect, plateResult, result, CharSearchDirection::LEFT);
        if (1 && showDebug) {
          std::cout << "searchLeftWeakSeed:" << searchLeftWeakSeed.size() << std::endl;
        }
        for (auto seed : searchLeftWeakSeed) {
          cv::rectangle(result, seed.getCharacterPos(), Scalar(255, 0, 0), 1);
          mserCharacter.push_back(seed);
        }
      }

      // sometimes the weak seed is in the middle of the strong seed.
      // and sometimes two strong seed are actually the two parts of one character.
      // because we only consider the weak seed in the left and right direction of strong seed.
      // now we examine all the strong seed and weak seed. not only to find the seed in the middle,
      // but also to combine two seed which are parts of one character to one seed.
      // only by this process, we could use the seed count as the condition to judge if or not to use slide window.
      float min_thresh = 0.3f;
      float max_thresh = 2.5f;
      reFoundAndCombineRect(mserCharacter, min_thresh, max_thresh, dist, maxrect, result);

      // if the characters count is less than max count
      // this means the mser rect in the lines are not enough.
      // sometimes there are still some characters could not be captured by mser algorithm,
      // such as blur, low light, and some chinese characters like zh-cuan.
      // to handle this, we use a simple slide window method to find them.
      if (mserCharacter.size() < char_max_count) {
        if (1 && showDebug) {
          std::cout << "search chinese:" << std::endl;
          std::cout << "judege the left is chinese:" << std::endl;
        }

        // if the left most character is chinese, this means
        // that must be the first character in chinese plate,
        // and we need not to do a slide window to left. So,
        // the first thing is to judge whether the left character
        // is chinese or not.
        bool leftIsChinese = false;
        if (1) {
          std::sort(mserCharacter.begin(), mserCharacter.end(),
            [](const CCharacter& r1, const CCharacter& r2) {
            return r1.getCharacterPos().tl().x < r2.getCharacterPos().tl().x;
          });

          CCharacter leftChar = mserCharacter[0];

          //Rect theRect = adaptive_charrect_from_rect(leftChar.getCharacterPos(), image.cols, image.rows);
          Rect theRect = leftChar.getCharacterPos();
          //cv::rectangle(result, theRect, Scalar(255, 0, 0), 1);

          Mat region = image(theRect);
          Mat binary_region;

          ostu_level = cv::threshold(region, binary_region, 0, 255, CV_THRESH_BINARY | CV_THRESH_OTSU);
          if (1 && showDebug) {
            std::cout << "left : ostu_level:" << ostu_level << std::endl;
          }
          //plate.setOstuLevel(ostu_level);

          Mat charInput = preprocessChar(binary_region, char_size);
          if (0 /*&& showDebug*/) {
            imshow("charInput", charInput);
            waitKey(0);
            destroyWindow("charInput");
          }

          std::string label = "";
          float maxVal = -2.f;
          leftIsChinese = CharsIdentify::instance()->isCharacter(charInput, label, maxVal, true);
          //auto character = CharsIdentify::instance()->identifyChinese(charInput, maxVal, leftIsChinese);
          //label = character.second;
          if (0 /* && showDebug*/) {
            std::cout << "isChinese:" << leftIsChinese << std::endl;
            std::cout << "chinese:" << label;
            std::cout << "__score:" << maxVal << std::endl;
          }
        }

        // if the left most character is not a chinese,
        // this means we need to slide a window to find the missed mser rect.
        // search for sliding window
        float ratioWindow = 0.4f;
        //float ratioWindow = CParams::instance()->getParam3f();
        float threshIsCharacter = 0.8f;
        //float threshIsCharacter = CParams::instance()->getParam3f();
        if (!leftIsChinese) {
          slideWindowSearch(image, slideLeftWindow, line, leftPoint, dist, ostu_level, ratioWindow, threshIsCharacter,
            maxrect, plateResult, CharSearchDirection::LEFT, true, result);
          if (1 && showDebug) {
            std::cout << "slideLeftWindow:" << slideLeftWindow.size() << std::endl;
          }
          for (auto window : slideLeftWindow) {
            cv::rectangle(result, window.getCharacterPos(), Scalar(0, 0, 255), 1);
            mserCharacter.push_back(window);
          }
        }
      }

      // if we still have less than max count characters,
      // we need to slide a window to right to search for the missed mser rect.
      if (mserCharacter.size() < char_max_count) {
        // change ostu_level
        float ratioWindow = 0.4f;
        //float ratioWindow = CParams::instance()->getParam3f();
        float threshIsCharacter = 0.8f;
        //float threshIsCharacter = CParams::instance()->getParam3f();
        slideWindowSearch(image, slideRightWindow, line, rightPoint, dist, plate.getOstuLevel(), ratioWindow, threshIsCharacter,
          maxrect, plateResult, CharSearchDirection::RIGHT, false, result);
        if (1 && showDebug) {
          std::cout << "slideRightWindow:" << slideRightWindow.size() << std::endl;
        }
        for (auto window : slideRightWindow) {
          cv::rectangle(result, window.getCharacterPos(), Scalar(0, 0, 255), 1);
          mserCharacter.push_back(window);
        }
      }

      // compute the plate angle
      float angle = atan(k) * 180 / (float)CV_PI;
      if (1 && showDebug) {
        std::cout << "k:" << k << std::endl;
        std::cout << "angle:" << angle << std::endl;
      }

      // the plateResult rect needs to be enlarged to contain the whole plate,
      // not only the character area.
      float widthEnlargeRatio = 1.15f;
      float heightEnlargeRatio = 1.25f;
      RotatedRect platePos(Point2f((float)plateResult.x + plateResult.width / 2.f, (float)plateResult.y + plateResult.height / 2.f),
        Size2f(plateResult.width * widthEnlargeRatio, maxrect.height * heightEnlargeRatio), angle);

      // verify that the size is likely to be a plate size.
      if (verifyRotatedPlateSizes(platePos)) {
        rotatedRectangle(result, platePos, Scalar(0, 0, 255), 1);

        plate.setPlatePos(platePos);
        plate.setPlateColor(the_color);
        plate.setPlateLocateType(CMSER);

        if (the_color == BLUE) out_plateVec_blue.push_back(plate);
        if (the_color == YELLOW) out_plateVec_yellow.push_back(plate);
      }

      // use deskew to rotate the image, so we need the binary image.
      if (1) {
        for (auto mserChar : mserCharacter) {
          Rect rect = mserChar.getCharacterPos();
          match.at(color_index)(rect) = 255;
        }
        cv::line(match.at(color_index), rightPoint, leftPoint, Scalar(255));
      }
    }

    if (0 /*&& showDebug*/) {
      imshow("result", result);
      waitKey(0);
      destroyWindow("result");
    }

    if (0) {
      imshow("match", match.at(color_index));
      waitKey(0);
      destroyWindow("match");
    }

    if (0) {
      std::stringstream ss(std::stringstream::in | std::stringstream::out);
      ss << "resources/image/tmp/plateDetect/plate_" << img_index << "_" << the_color << ".jpg";
      imwrite(ss.str(), result);
    }
  }


}
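First, MSER extracts regions, and a size check filters out those that clearly do not match the size of a plate character. Next, a character classifier is applied, and regions with a probability above 0.9 become strong seeds (the green boxes in Figure 5). Neighboring strong seeds are merged into a group and a line is drawn through their centers (the white line in the figure). In general this line is the central axis of the plate, with the same slope. Then weak seeds, those with a probability below 0.9, are searched for near this line (blue boxes). Given how plates look, these blue boxes should not be far from the green ones and should not differ much from them in size. Blue boxes are searched for to the left and right of the green boxes. Sometimes a box is also missing between several green boxes; it can be inferred from the spacing between the boxes, and this gives the orange boxes. Once the search is done, the total number of green, blue and orange boxes is the number of characters found in the plate region so far. This number is sometimes below 7 (the number of characters on a Chinese plate), because some regions cannot be extracted even by MSER (for example very unstable ones or ones with strong lighting variation), and many Chinese characters cannot be extracted by MSER either (most Chinese characters are not connected, while MSER regions are essentially connected). So a sliding window (red boxes) is added to look for these missing characters or Chinese characters; if the classifier probability exceeds a threshold, the window is added to the final result. Finally, a box drawn around all the character positions gives the plate region.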
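The seed growing steps described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not EasyPR's code: the `Candidate` and `Line` types, the function names, and the offset and height tolerances are assumptions made for the example. Only the 0.9 strong-seed threshold comes from the text.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical, simplified stand-in for a character candidate.
struct Candidate {
  double cx, cy;  // center of the character box
  double height;  // height of the character box
  double score;   // classifier probability in [0, 1]
};

struct Line { double k, b; };  // y = k * x + b

// Split candidates into strong seeds (score > strongThresh) and the rest.
void splitSeeds(const std::vector<Candidate>& all, double strongThresh,
                std::vector<Candidate>& strong, std::vector<Candidate>& rest) {
  for (const auto& c : all) (c.score > strongThresh ? strong : rest).push_back(c);
}

// Least-squares line through the centers of the strong seeds.
// Assumes at least two seeds that do not all share the same x coordinate.
Line fitCenterLine(const std::vector<Candidate>& seeds) {
  assert(seeds.size() >= 2);
  double sx = 0, sy = 0, sxx = 0, sxy = 0;
  for (const auto& s : seeds) {
    sx += s.cx; sy += s.cy; sxx += s.cx * s.cx; sxy += s.cx * s.cy;
  }
  const double n = static_cast<double>(seeds.size());
  const double denom = n * sxx - sx * sx;
  assert(denom != 0.0);
  const double k = (n * sxy - sx * sy) / denom;
  return {k, (sy - k * sx) / n};
}

// Keep the weaker candidates that lie close to the line and whose height is
// similar to refHeight (both tolerances are illustrative assumptions).
std::vector<Candidate> growWeakSeeds(const std::vector<Candidate>& rest, const Line& line,
                                     double refHeight, double maxOffsetRatio,
                                     double maxHeightRatio) {
  std::vector<Candidate> grown;
  for (const auto& c : rest) {
    const double offset = std::fabs(c.cy - (line.k * c.cx + line.b));  // vertical offset
    const double ratio = c.height > refHeight ? c.height / refHeight : refHeight / c.height;
    if (offset <= maxOffsetRatio * refHeight && ratio <= maxHeightRatio) grown.push_back(c);
  }
  return grown;
}
```

The sliding window step would then probe positions along the same line, one character spacing beyond the leftmost and rightmost box, and keep a window only if the classifier score exceeds a threshold.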
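To debug using the intermediate images, first follow the call chain plateMserLocate->mserSearch->mserCharMatch to find the spot in core_func.cpp. At the end of the function, change the condition that guards the image output to 1. Then create the directories tmp and plateDetect, in that order, under resources/image (matching the path in the code). On the next run the debug images appear in the new directory. (EasyPR contains many other similar output snippets; just create the folders the code expects to see their output.)
Figure 5: intermediate result of text localization (debug image)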
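2. More reasonable and accurate evaluation metrics
The old EasyPR evaluation criteria were unreasonable in several ways. For example, an image counted as a successful localization as soon as one candidate region was found in it. And if several plates were located in an image, only the one with the smallest error rate was used as the localization result. The problem is that the candidate region found may not be a plate at all. Likewise, for an image containing several plates, keeping only a single one as the result is clearly unreasonable.
The new metrics therefore have to account for the positional difference between the located region and the true plate region; only when the two are close does a localization count as successful. Also, an image with several plates has a corresponding number of located regions; each region is compared with the plates, and only the combined result describes localization quality. This requires ground truth that records the position of every plate. For the new version we annotated 251 images containing the positions of 250 plates in total. To measure how far located regions deviate from plate regions, we adopted the ICDAR 2003 evaluation protocol, which yields the final recall, precision (shown as "Precise" in the output) and Fscore values.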
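As a rough illustration, the sketch below follows the ICDAR 2003 definitions as we understand them: the match between two rectangles is their intersection area divided by the area of the smallest rectangle containing both, recall and precision average the best match of each ground-truth and each detected rectangle, and the F-score is their harmonic mean. The `Box` type and the function names are hypothetical, and how EasyPR aggregates the values over a whole test set is not shown.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct Box { int x, y, w, h; };  // axis-aligned rectangle (hypothetical type)

// Match score: intersection area / area of the smallest box containing both.
double matchScore(const Box& a, const Box& b) {
  const int ix = std::max(0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
  const int iy = std::max(0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
  const int ex = std::max(a.x + a.w, b.x + b.w) - std::min(a.x, b.x);
  const int ey = std::max(a.y + a.h, b.y + b.h) - std::min(a.y, b.y);
  const double enclosing = static_cast<double>(ex) * ey;
  return enclosing > 0 ? static_cast<double>(ix) * iy / enclosing : 0.0;
}

// Best match of one rectangle against a set of rectangles (0 if the set is empty).
double bestMatch(const Box& r, const std::vector<Box>& set) {
  double best = 0.0;
  for (const auto& s : set) best = std::max(best, matchScore(r, s));
  return best;
}

// Harmonic mean of recall and precision.
double fscore(double recall, double precision) {
  assert(recall >= 0.0 && precision >= 0.0);
  return recall + precision > 0 ? 2.0 * recall * precision / (recall + precision) : 0.0;
}

struct Scores { double recall, precision, fscore; };

Scores evaluate(const std::vector<Box>& truth, const std::vector<Box>& detected) {
  double r = 0.0, p = 0.0;
  for (const auto& t : truth) r += bestMatch(t, detected);
  for (const auto& d : detected) p += bestMatch(d, truth);
  r = truth.empty() ? 0.0 : r / truth.size();
  p = detected.empty() ? 0.0 : p / detected.size();
  return {r, p, fscore(r, p)};
}
```

With a recall of 46.1696% and a precision of 26.3273%, `fscore` gives about 33.533%, which agrees with the SOBEL+COLOR figures reported in the results of this section.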
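Plate localization evaluation changed a lot; character recognition evaluation changed only slightly. The "average character difference" metric, which carried little meaning, was removed and replaced with three ratios: 0-error, 1-error and Chinese-precise. 0-error is the share of results identical to the true plate, the same as the "fully correct rate" of the old protocol. 1-error is the share of results that differ by at most one character, so it includes the 0-error cases. Note that a Chinese character usually counts as two characters. Chinese-precise is the share of plates whose Chinese character is recognized correctly. For all three, higher is better, with 100% the maximum.
To see these metrics in practice, we ran a comparison on the 50 complex images newly added to the general test set. The results below show how the text localization method performs on this data compared with the original SOBEL and COLOR localization methods.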
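One way to compute such ratios is with an edit distance between the recognized string and the ground-truth string. The sketch below is an assumption about how this could be done, not EasyPR's actual code: `editDistance` and `errorRates` are hypothetical names, and strings are compared byte by byte, which is why a multi-byte Chinese character can count as more than one character.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Levenshtein distance between two byte strings (two-row dynamic programming).
std::size_t editDistance(const std::string& a, const std::string& b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      const std::size_t substitute = prev[j - 1] + (a[i - 1] != b[j - 1] ? 1 : 0);
      cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, substitute});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

struct ErrorRates {
  double zeroError;  // share of results identical to the ground truth
  double oneError;   // share of results with an edit distance of at most 1
};

// Each pair holds (recognized plate, ground-truth plate).
ErrorRates errorRates(const std::vector<std::pair<std::string, std::string>>& pairs) {
  assert(!pairs.empty());
  std::size_t zero = 0, one = 0;
  for (const auto& p : pairs) {
    const std::size_t d = editDistance(p.first, p.second);
    if (d == 0) ++zero;
    if (d <= 1) ++one;
  }
  const double n = static_cast<double>(pairs.size());
  return {zero / n, one / n};
}
```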
SOBEL+COLOR:
Total images:50, Plates count:52, Localization rate:51.9231%
Recall:46.1696%, Precise:26.3273%, Fscore:33.533%.
0-error:12.5%, 1-error:12.5%, Chinese-precise:37.5%
CMSER:
Total images:50, Plates count:52, Localization rate:78.8462%
Recall:70.6192%, Precise:70.1825%, Fscore:70.4002%.
0-error:59.4595%, 1-error:70.2703%, Chinese-precise:70.2703%
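The localization rate rose by almost 27 percentage points, and the localization Fscore and the Chinese recognition accuracy roughly doubled.
3. Non-maximum suppression
Another major change in the new version is the heavy use of non-maximum suppression (NMS). It has the following benefits. 1. When several located regions overlap, the one most likely to be correct can be kept according to their confidence (the value produced by the SVM plate judgment model) and the others removed. As a result, when different localization methods such as Sobel and Color locate the same region, only one candidate survives, so a single plate no longer ends up with several boxes in the new version. 2. Combined with a sliding window, NMS can pinpoint character positions, for example finding the most probable character position in the plate localization module, or locating the Chinese character more precisely in the character recognition module.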
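A minimal sketch of greedy non-maximum suppression over scored boxes follows. It assumes that a higher score means a more likely detection and uses intersection over union as the overlap measure; EasyPR's own overlap measure and score convention may differ, and the `Detection` type and function names are made up for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical scored box; a higher score is assumed to mean "more likely".
struct Detection { int x, y, w, h; double score; };

// Intersection over union of two axis-aligned boxes.
double iou(const Detection& a, const Detection& b) {
  const int ix = std::max(0, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
  const int iy = std::max(0, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
  const double inter = static_cast<double>(ix) * iy;
  const double uni = static_cast<double>(a.w) * a.h + static_cast<double>(b.w) * b.h - inter;
  return uni > 0 ? inter / uni : 0.0;
}

// Greedy NMS: visit boxes from best to worst and drop any box that overlaps
// an already kept box by more than overlapThresh.
std::vector<Detection> nonMaxSuppression(std::vector<Detection> dets, double overlapThresh) {
  assert(overlapThresh >= 0.0 && overlapThresh <= 1.0);
  std::sort(dets.begin(), dets.end(),
            [](const Detection& a, const Detection& b) { return a.score > b.score; });
  std::vector<Detection> kept;
  for (const auto& d : dets) {
    bool suppressed = false;
    for (const auto& k : kept) {
      if (iou(d, k) > overlapThresh) { suppressed = true; break; }
    }
    if (!suppressed) kept.push_back(d);
  }
  return kept;
}
```

In this sketch, two boxes produced for the same plate by different localization methods collapse into the higher-scoring one, while a box elsewhere in the image is left alone.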
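NMS also decouples EasyPR's localization methods from the later recognition modules. Previously, each added localization method could affect the final output. Now, however many methods locate a plate, NMS keeps only the most probable candidate, so the later stages are not affected.
In addition, setMaxPlates() now actually works. Previously it could be set but had no effect. Now, after setting it to n, when more than n plate regions are detected in an image (counted after non-maximum suppression), EasyPR outputs only the n most likely ones.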
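The effect of such a limit can be sketched as below. This is only an illustration of "keep the n most likely regions": the `ScoredPlate` type and `keepTopN` are hypothetical, and the sketch assumes that a higher confidence means a more likely plate, which may not match the sign convention of EasyPR's own score.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plate candidate that already survived non-maximum suppression.
struct ScoredPlate { int id; double confidence; };

// Keep at most maxPlates candidates, most confident first.
std::vector<ScoredPlate> keepTopN(std::vector<ScoredPlate> plates, std::size_t maxPlates) {
  std::sort(plates.begin(), plates.end(),
            [](const ScoredPlate& a, const ScoredPlate& b) { return a.confidence > b.confidence; });
  if (plates.size() > maxPlates) plates.resize(maxPlates);
  return plates;
}
```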
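4. Improved character segmentation and recognition
Both character segmentation and recognition gained new algorithms in the new version. For example, spatial-ostu (a spatial Otsu threshold) replaces the plain Otsu algorithm, which improves binarization during segmentation on unevenly lit images.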
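The idea behind a spatial Otsu threshold can be sketched as follows: split the image into a grid and compute a separate Otsu threshold for each cell, so that a bright part and a dark part of the plate each get a suitable threshold. This is a simplified illustration under our own assumptions (the `GrayImage` type, the function names, and the grid parameters are hypothetical), not the implementation in EasyPR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image, stored row by row.
struct GrayImage {
  int rows, cols;
  std::vector<std::uint8_t> px;
};

// Otsu threshold over the cell [r0, r1) x [c0, c1): the gray level that
// maximizes the between-class variance. Pixels above it count as foreground.
int otsuThreshold(const GrayImage& img, int r0, int r1, int c0, int c1) {
  long hist[256] = {0};
  for (int r = r0; r < r1; ++r)
    for (int c = c0; c < c1; ++c)
      ++hist[img.px[static_cast<std::size_t>(r) * img.cols + c]];

  const long total = static_cast<long>(r1 - r0) * (c1 - c0);
  double sumAll = 0.0;
  for (int t = 0; t < 256; ++t) sumAll += static_cast<double>(t) * hist[t];

  double sumBack = 0.0, bestVar = -1.0;
  long wBack = 0;
  int best = 0;
  for (int t = 0; t < 256; ++t) {
    wBack += hist[t];
    sumBack += static_cast<double>(t) * hist[t];
    if (wBack == 0) continue;   // no background pixels yet
    const long wFore = total - wBack;
    if (wFore == 0) break;      // no foreground pixels left
    const double diff = sumBack / wBack - (sumAll - sumBack) / wFore;
    const double var = static_cast<double>(wBack) * wFore * diff * diff;
    if (var > bestVar) { bestVar = var; best = t; }
  }
  return best;
}

// Binarize every grid cell with its own Otsu threshold.
GrayImage spatialOtsu(const GrayImage& img, int gridX, int gridY) {
  assert(gridX > 0 && gridY > 0);
  assert(img.px.size() == static_cast<std::size_t>(img.rows) * img.cols);
  GrayImage out = img;
  for (int gy = 0; gy < gridY; ++gy) {
    for (int gx = 0; gx < gridX; ++gx) {
      const int r0 = gy * img.rows / gridY, r1 = (gy + 1) * img.rows / gridY;
      const int c0 = gx * img.cols / gridX, c1 = (gx + 1) * img.cols / gridX;
      const int t = otsuThreshold(img, r0, r1, c0, c1);
      for (int r = r0; r < r1; ++r) {
        for (int c = c0; c < c1; ++c) {
          const std::size_t idx = static_cast<std::size_t>(r) * img.cols + c;
          out.px[idx] = img.px[idx] > t ? 255 : 0;
        }
      }
    }
  }
  return out;
}
```

Calling `spatialOtsu(img, 1, 1)` reduces to a plain global Otsu threshold, which makes it easy to compare the two on the same image.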
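Figure 6: plate image (left), plain Otsu threshold result (middle), spatial Otsu threshold result (right)
Recognition also gained an adaptive threshold method for Chinese characters. It binarizes the character "川" better than Otsu does. Using both methods and keeping the result with the highest recognition probability significantly improves accuracy on Chinese characters. A small sliding window was also added for Chinese recognition, to compensate for imprecise localization when the Chinese character is located directly via the province character.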
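5. New features and SVM model, new ANN model for Chinese recognition
To make plate judgment more robust, the new version changes the features of the SVM model: the model using LBP features also judges well on low-contrast and poorly lit plate images. To improve Chinese recognition, a separate ANN model, ann_chinese, is now trained on the 31 classes of Chinese characters; when classifying Chinese characters it is nearly 10 percentage points better than the original general-purpose model.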
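To show why LBP features hold up under lighting changes, here is a minimal sketch of the basic 8-neighbor LBP code and its histogram. It illustrates the general technique only; the exact LBP variant and feature layout used by EasyPR's SVM model are not covered here, and the function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Basic 8-neighbor LBP code of the interior pixel (r, c) of a row-major image.
// Bit i is set when neighbor i (clockwise from the top-left) is >= the center.
std::uint8_t lbpCode(const std::vector<std::uint8_t>& px, int cols, int r, int c) {
  static const int dr[8] = {-1, -1, -1, 0, 1, 1, 1, 0};
  static const int dc[8] = {-1, 0, 1, 1, 1, 0, -1, -1};
  const std::uint8_t center = px[static_cast<std::size_t>(r * cols + c)];
  std::uint8_t code = 0;
  for (int i = 0; i < 8; ++i) {
    const std::uint8_t neighbor = px[static_cast<std::size_t>((r + dr[i]) * cols + (c + dc[i]))];
    if (neighbor >= center) code = static_cast<std::uint8_t>(code | (1u << i));
  }
  return code;
}

// Histogram of LBP codes over all interior pixels; this is the feature vector.
std::array<int, 256> lbpHistogram(const std::vector<std::uint8_t>& px, int rows, int cols) {
  assert(rows >= 3 && cols >= 3);
  assert(px.size() == static_cast<std::size_t>(rows) * cols);
  std::array<int, 256> hist{};
  for (int r = 1; r + 1 < rows; ++r)
    for (int c = 1; c + 1 < cols; ++c)
      ++hist[lbpCode(px, cols, r, c)];
  return hist;
}
```

Because each code depends only on whether a neighbor is brighter than the center, adding a constant brightness offset to the image (without saturation) leaves the codes unchanged.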
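6. Miscellaneous
EasyPR 1.5-alpha was released a few days ago. Compared with alpha, today's beta adds a Grid Search feature, further tunes some parameters of the text localization method, and removes some Chinese comments for better compatibility on Windows. On the speed side, this version uses multithreading (OpenMP) for the first time to improve the overall efficiency of the algorithms, which makes it about 2x faster overall.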
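The OpenMP pattern used here, a parallel loop whose iterations write to separate outputs, can be sketched as follows. The workload (`sumOfSquares`) and the function names are placeholders invented for the example. The code also builds without OpenMP support, in which case the pragma is ignored and the loop runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Placeholder for an expensive, independent unit of work.
long sumOfSquares(const std::vector<int>& values) {
  long sum = 0;
  for (int v : values) sum += static_cast<long>(v) * v;
  return sum;
}

// Process every input in parallel. Each iteration writes only to its own
// slot of results, so the iterations do not race with each other.
std::vector<long> processAll(const std::vector<std::vector<int>>& inputs) {
  std::vector<long> results(inputs.size());
  const int count = static_cast<int>(inputs.size());
#pragma omp parallel for
  for (int i = 0; i < count; ++i) {
    results[static_cast<std::size_t>(i)] = sumOfSquares(inputs[static_cast<std::size_t>(i)]);
  }
  return results;
}
```

With GCC or Clang the parallel version is enabled with the -fopenmp flag. The loop over the two plate colors in mserCharMatch above follows the same shape: each color index fills its own output vectors.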
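Now for the shortcomings of the new version. Text localization is indeed very robust, but unfortunately it is still nearly twice as slow as color localization (about on par with Sobel localization). We plan to optimize it in later releases. Character segmentation also has room for better algorithms, and a future version may attempt larger changes there.
A note on EasyPR: EasyPR is an open-source Chinese license plate recognition system whose code is hosted on github and gitosc. The earlier posts on this blog contain its development documentation and introductions to date.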
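Copyright notice:
The copyright of all text, images and code in this article is jointly owned by the author and cnblogs. Reposting is welcome, provided the author and the source are credited. Any unauthorized plagiarism or crawler scraping is an infringement, and the author and cnblogs reserve all rights.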
References:
1. Character-MSER: Scene Text Detection with Robust Character Candidate Extraction Method, ICDAR 2015
2. Seed-growing: A Robust Hierarchical Detection Method for Scene Text Based on Convolutional Neural Networks, ICME 2015