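There are several ways to get the Tesseract source code: you can check it out directly from the repo or download a source archive. Compiling it, however, often runs into all sorts of odd problems. This post describes a simple way to configure and build the source.
Reference: How to Build Tesseract OCR Library on Windows
During installation, check the "Tesseract development files" option.
In the installation directory, locate the vs2008 project directory.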
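Locate all the libraries needed for the build.
Open Visual Studio 2008 (if you do not have it, the Express edition can be downloaded from the official site), import the project, and build it. This produces Debug and Release versions of the DLL: libtesseract302d.dll and libtesseract302.dll.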
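After a build it can be handy to confirm that both DLLs were actually produced. The following is a minimal Python sketch, not part of the original walkthrough: it searches a directory tree for the two DLL names mentioned above. The helper names `find_dlls` and `missing_dlls` are hypothetical, and since the output directory is not stated here, the sketch searches recursively rather than assuming a location.

```python
from pathlib import Path

# DLL names taken from the build step described above.
TESSERACT_DLLS = ["libtesseract302d.dll", "libtesseract302.dll"]


def find_dlls(root, names):
    """Return a dict mapping each expected DLL name to the paths found under root."""
    wanted = {name.lower(): name for name in names}
    found = {name: [] for name in names}
    for path in Path(root).rglob("*.dll"):
        key = path.name.lower()
        if key in wanted:
            found[wanted[key]].append(path)
    return found


def missing_dlls(root, names):
    """Return the expected DLL names that were not found anywhere under root."""
    return [name for name, paths in find_dlls(root, names).items() if not paths]
```

Calling `missing_dlls(project_dir, TESSERACT_DLLS)` with the directory that holds the Visual Studio project returns an empty list when both the Debug and Release builds succeeded.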
Note this passage in the README:
    Dependencies and Licenses
    =========================
    Leptonica is required. (www.leptonica.com). Tesseract no longer compiles without Leptonica.
    Libtiff is no longer required as a direct dependency.
Tesseract depends on the Leptonica library, so the next step is to see how Leptonica is built.
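Leptonica is an image processing library written in C that supports JPEG, PNG, TIFF, and GIF.
Unpack the three packages and set up the build environment with the following structure: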
BuildFolder\
    include\
    leptonica-1.68\
    lib\
BuildFolder\leptonica-1.68 contents:
    config\                         Not used for Windows builds
    prog\                           Regression tests, examples, utilities
    src\                            Source files for liblept
    vs2008\                         Visual Studio 2008 specific files
        DLL Debug\                  liblept DLL Debug build output
        DLL Release\                liblept DLL Release build output
        LIB Debug\                  liblept LIB Debug build output
        LIB Release\                liblept LIB Release build output
        prog_projects\              Projects for prog programs
            ioformats_reg\          Sample project for prog\ioformats_reg.exe
                DLL Debug\          DLL Debug build output for sample project
                DLL Release\        DLL Release build output for sample project
                LIB Debug\          LIB Debug build output for sample project
                LIB Release\        LIB Release build output for sample project
                ioformats_reg.vcproj    The ioformats_reg project file
        leptonica.sln               The Leptonica solution file
        leptonica.vcproj            The Leptonica project file
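Before opening the solution, the folder layout described above can be checked with a short script. This is only a sketch: the function name `missing_entries` is hypothetical, and it assumes that leptonica.sln and leptonica.vcproj sit directly inside the vs2008 folder.

```python
from pathlib import Path

# Top-level directories expected inside BuildFolder, per the layout above.
EXPECTED_DIRS = ["include", "lib", "leptonica-1.68"]

# Assumption: the solution and project files live directly under vs2008.
EXPECTED_FILES = [
    "leptonica-1.68/vs2008/leptonica.sln",
    "leptonica-1.68/vs2008/leptonica.vcproj",
]


def missing_entries(build_folder):
    """Return the expected directories and files that are absent from build_folder."""
    root = Path(build_folder)
    missing = [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

An empty list from `missing_entries("BuildFolder")` suggests the archives were unpacked into the right places; any names it returns point to a folder or file that still needs to be put in position.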
Open Visual Studio 2008, import the project, and build it. This produces Debug and Release versions of the DLL: liblept168d.dll and liblept168.dll.
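The same build can probably be scripted instead of run from the IDE. The sketch below assembles a command line for `vcbuild`, the command-line builder that ships with Visual Studio 2008, and runs it for each configuration. Several things here are assumptions rather than facts from this post: that `vcbuild` is on the PATH (for example in a Visual Studio 2008 command prompt), that the solution defines configurations named "DLL Debug" and "DLL Release" (inferred from the output folder names), and that the platform is Win32. The function names are hypothetical.

```python
import subprocess


def vcbuild_command(solution, configuration, platform="Win32", rebuild=False):
    """Build the argument list for: vcbuild [/rebuild] <solution> "<config>|<platform>"."""
    cmd = ["vcbuild"]
    if rebuild:
        cmd.append("/rebuild")
    cmd.append(str(solution))
    cmd.append(f"{configuration}|{platform}")
    return cmd


def build_all(solution, configurations=("DLL Debug", "DLL Release")):
    """Run vcbuild once per configuration; raises CalledProcessError if a build fails."""
    for configuration in configurations:
        subprocess.run(vcbuild_command(solution, configuration), check=True)
```

For example, `build_all(r"BuildFolder\leptonica-1.68\vs2008\leptonica.sln")` would attempt both DLL builds in sequence. If the configuration names differ in your copy of the solution, pass the names shown in the Visual Studio configuration drop-down instead.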