Text Detection: A Summary of Text Detection Datasets

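Character recognition and text detection matter a great deal in everyday applications: everything from simple license plate detection to complex scene text recognition depends on this technology. The best-known conference in this field is the International Conference on Document Analysis and Recognition (ICDAR).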

1. Main datasets for text detection and recognition



Total-Text

paper


COCO-Text, COCO-Text V2

paper


MSRA-TD500
ref paper


ICDAR2017: the competitions include datasets from multiple domains.

Category: Handwritten Historical Document Layout Recognition
cBAD: ICDAR2017 Competition on Baseline Detection
ICDAR2017 Competition on Layout Analysis for Challenging Medieval Manuscripts
ICDAR2017 Competition on Historical Book Analysis

Category: Historical Handwritten Script Analysis
ICDAR 2017 Competition on the Classification of Medieval Handwritings in Latin Script
ICDAR2017 Competition on Historical Document Writer Identification (Historical-WI)
Competition on Multi-script Writer Identification Using LAMIS-MSHD and CERUG Databases

Category: Character/Word Spotting
Competition on Query-by-Example Glyph Spotting of Southeast Asian Palm Leaf Manuscript Images
Handwritten Keyword Spotting Competition

Category: Handwriting Recognition
ICDAR2017 Competition on Handwritten Text Recognition on the READ Dataset
ICDAR2017 Competition on Information Extraction in Historical Handwritten Records

Category: Document Image Binarization
ICDAR2017 Competition on Document Image Binarization (DIBCO 2017)

Category: Document Recognition (Layout analysis & Text Recognition)
ICDAR2017 Competition on Recognition of Documents with Complex Layouts – RDCL2017
ICDAR2017 Competition on Recognition of Early Indian Printed Documents – REID2017
ICDAR2017 Competition on Page Object Detection

Category: Document Reconstruction
Smartphone-captured Document Image Reconstruction from Multiple Views

Category: Post OCR Correction
ICDAR2017 Competition on Post-OCR Text Correction

Category: Robust Reading Competitions
ICDAR2017 Competition on Reading Chinese Text in the Wild (RCTW-17), paper: https://arxiv.org/pdf/1708.09585.pdf
ICDAR2017 Robust Reading Challenge on COCO-Text

Category: Text in Video
ICDAR2017 Competition on Arabic Text Detection and Recognition in Multi-resolution Video Frames
Competition on Video Script Identification

Category: Forensics
Competition on File Type Identification

Category: Miscellaneous Competitions
ICDAR2017 Competition on Multi-font and Multi-Size Digitally Represented Arabic Text

ref: http://mac.xmu.edu.cn/valse2017/ppt/Invited/VALSE2017_bx.pdf

ICDAR2015
Scene text recognition
Born-digital image text recognition
Also includes a text super-resolution dataset
An interface in OpenCV


ICDAR2013
Robust Reading:http://refbase.cvc.uab.es/files/KSU2013.pdf
Chinese handwriting dataset, download
ref: https://www.computer.org/csdl/proceedings/icdar/2013/4999/00/06628568.pdf
Digital documents researcher: https://roundtrippdf.com/en/post

2. Some recently published work (from Total-Text)

Detection

MSRFTSN, TextSnake, TextField, Mask TextSpotter, TextNet, Textboxes, EAST, Baseline, SegLink

End-to-end Recognition

TextNet, Mask TextSpotter, Textboxes


In addition, the following datasets are related to digit and character recognition:
Handwritten character recognition: MNIST


Street View House Numbers dataset: SVHN


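Some related websites where more datasets can be found:
International Association for Pattern Recognition, Technical Committee 11 (OCR): IAPR-TC11, TC11 datasets
TC10 working group on image recognition: TC10 datasets
ICDAR 2017 summary: https://github.com/cs-chan/Total-Text-Dataset
Summary site for recent Robust Reading competitions: http://rrc.cvc.uab.es/
Research guide: http://www.guide2research.com/conference/icdar-2019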

