Atitit. A roundup of OCR frameworks and libraries (attilax summary)

 

 


 

Tesseract

Asprise JavaOCR

 

 

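With some free time on my hands, I noticed that Baidu offers an OCR text recognition API. It looked interesting, so I decided to take a closer look.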

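About the Baidu service: text recognition is Baidu's natural-scene OCR service. Built on Baidu's OCR algorithms, which Baidu describes as industry-leading, it provides whole-image text detection and recognition, whole-image text recognition, whole-image text-line localization, and single-character image recognition.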

Enough talk, let's go straight to the demo!

 

 

java4less

The J4L OCR tools are a set of components that can be used to add OCR capabilities to Java applications. That means you can receive faxes, PDF files, or scanned documents and extract business information from the images. The three main components are:

A Java wrapper for the Tesseract OCR engine. The Tesseract engine itself is delivered under the Apache 2.0 license, and we support a version compiled for Windows only.

A PDF-to-text converter.

A text document parser.
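To make the first component more concrete, here is a minimal sketch of what wrapping Tesseract from Java can look like. It is not the J4L API: the class `TesseractCli` and its methods are hypothetical names, and the sketch assumes the `tesseract` command-line tool is installed and on the `PATH`, and that Java 9 or later is available (for `readAllBytes`).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of J4L): shells out to the `tesseract` CLI.
class TesseractCli {

    // Builds the command line: tesseract <image> stdout -l <lang>
    static List<String> buildCommand(Path image, String lang) {
        List<String> cmd = new ArrayList<>();
        cmd.add("tesseract");
        cmd.add(image.toString());
        cmd.add("stdout"); // write the recognized text to standard output
        cmd.add("-l");     // language of the text, e.g. "eng"
        cmd.add(lang);
        return cmd;
    }

    // Runs Tesseract on one image and returns the recognized text.
    static String recognize(Path image, String lang)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(image, lang))
                .redirectError(ProcessBuilder.Redirect.INHERIT) // show Tesseract warnings
                .start();
        byte[] output = process.getInputStream().readAllBytes();
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("tesseract exited with code " + exitCode);
        }
        return new String(output, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.err.println("usage: java TesseractCli <image-file>");
            return;
        }
        System.out.println(recognize(Paths.get(args[0]), "eng"));
    }
}
```

Calling the command-line tool keeps the sketch free of native bindings. A real wrapper would more likely bind to the Tesseract library directly.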

The document recognition process can therefore be divided into two steps:

The first step takes an image file (tif, png, jpg, ...) or a PDF file and returns the text it contains. For images, the Java wrapper performs this operation using Tesseract; alternatively, you can use any other OCR engine. For PDF files, you use our PDF-to-text converter instead.

In the second step, your Java application needs to understand the text returned by the OCR engine or PDF converter. This is done by the document parser, which takes as input a text string (the data) and an XML file that describes the structure of the document. The output is a business document, either as a Java object or as an XML file.
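The second step can be sketched as follows. This is only an illustration of the idea, not J4L's document parser: the XML layout (a `<document>` with `<field name="..." pattern="..."/>` entries holding regular expressions), the class `SimpleDocumentParser`, and the sample invoice text are all assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical parser: extracts named fields from OCR text using an XML layout.
class SimpleDocumentParser {

    // text: output of the OCR engine or PDF converter.
    // layoutXml: assumed format, one <field name="..." pattern="..."/> per value,
    // where the first capture group of the pattern is the field value.
    static Map<String, String> parse(String text, String layoutXml) throws Exception {
        Document layout = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(layoutXml.getBytes(StandardCharsets.UTF_8)));
        NodeList fields = layout.getElementsByTagName("field");
        Map<String, String> result = new LinkedHashMap<>();
        for (int i = 0; i < fields.getLength(); i++) {
            Element field = (Element) fields.item(i);
            Matcher matcher = Pattern.compile(field.getAttribute("pattern")).matcher(text);
            if (matcher.find()) {
                result.put(field.getAttribute("name"), matcher.group(1));
            }
        }
        return result; // fields whose pattern did not match are simply absent
    }

    public static void main(String[] args) throws Exception {
        String layoutXml =
              "<document name=\"invoice\">"
            + "<field name=\"invoiceNumber\" pattern=\"Invoice No: (\\S+)\"/>"
            + "<field name=\"total\" pattern=\"Total: ([0-9.]+)\"/>"
            + "</document>";
        String ocrText = "ACME Corp\nInvoice No: A-1001\nTotal: 249.90\n";
        System.out.println(parse(ocrText, layoutXml));
        // prints {invoiceNumber=A-1001, total=249.90}
    }
}
```

Keeping the field definitions in XML rather than in code means a new document type only needs a new layout file. The resulting map stands in for the "business document" object that the text describes.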

 

 

 

Reference: Implementing Baidu OCR Text Recognition in Java - 張榮珍's column - Blog Channel - CSDN.NET

Author: nickname 老哇的爪子 (full name: Attilax Akbar Al Rapanui 阿提拉克斯 阿克巴 阿爾 拉帕努伊)

Chinese name: 艾提拉 (艾龍)   Email: 1466519819@qq.com

When reposting, please credit the source: http://www.cnblogs.com/attilax/

Atiend
