Learning Python 3: Trying Out Baidu OCR and Tencent OCR

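I needed OCR for a small feature, so I started by trying two providers, Baidu and Tencent. I won't cover how to enable the services or how to create an application and obtain its keys.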

 

Baidu's is fairly simple: importing a single class, AipOcr, takes care of everything. The code is as follows:

from aip import AipOcr

# Replace the 3 variables below with your own values
APP_ID = '1111118'
API_KEY = 'r011111111iAfy'
SECRET_KEY = 'ZKca1111111DK5XZrq'

aipOcr  = AipOcr(APP_ID, API_KEY, SECRET_KEY)

# Read the image
filePath = "d:/temp/0001.png"
def get_file_content(filePath):
    with open(filePath, 'rb') as fp:
        return fp.read()

# Define the optional parameters
options = {
  'detect_direction': 'true',
  'language_type': 'CHN_ENG',
}

# Call the general text recognition API (basicAccurate is the high-accuracy version)
result = aipOcr.basicAccurate(get_file_content(filePath), options)

print(result)
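
The printed result is a plain dict, so pulling out just the recognized text takes only a few lines. Below is a minimal sketch, assuming the response has the shape I have seen from this API: a `words_result` list whose items each carry a `words` string, with `error_code` and `error_msg` keys appearing on failure. The `extract_lines` helper and the `sample` dict are made up for illustration and are not part of the SDK.

```python
def extract_lines(result):
    """Return the recognized text lines from an OCR response dict.

    Assumes a successful response contains a 'words_result' list of
    {'words': ...} items and a failed one contains 'error_code'.
    """
    if "error_code" in result:
        raise RuntimeError("OCR failed: {} {}".format(
            result["error_code"], result.get("error_msg", "")))
    return [item["words"] for item in result.get("words_result", [])]


# Hypothetical sample response, only to show the expected shape
sample = {
    "log_id": 1,
    "words_result_num": 2,
    "words_result": [{"words": "USD 100"}, {"words": "Total"}],
}

print("\n".join(extract_lines(sample)))
```

In the script above, this would be called as `extract_lines(result)` in place of printing the whole dict.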

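Tencent's is much more of a pain. There is a Python library, but it is a 2.0 one. That is not the real problem, though. The real problem is that the library covers the other kinds of recognition but not printed text, so you have to send the HTTP request yourself.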

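Maybe it is because I have only just started learning Python, but the signature for the OCR request took me a whole day. The signature examples I found online were all for other applications, so I was stuck for a long time.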

Later I downloaded their Java SDK, read its signing code, and compared my results against its output until I finally got it working.

The full code is as follows:

import requests
import hmac
import hashlib
import base64
import time
import random


appid =  '12111173'
bucket = ""
secret_id = 'AKIDI111RAjYU'  # see the official documentation
secret_key = 'S2iRe011111iM6xlHo'  # same as above

# Integer Unix timestamps in seconds; the signature expires 30 days from now
current = int(time.time())
expired = current + 2592000
# 10-digit random string
rdm = ''.join(random.choice("0123456789") for i in range(10))
# Plaintext string to sign
info = "a=" + appid + "&b=" + bucket + "&k=" + secret_id + "&e=" + str(expired) + "&t=" + str(current) + "&r=" + str(rdm) + "&u=0&f="
print(info)
signature = bytes(info, encoding='utf-8')
secretkey = bytes(secret_key, encoding='utf-8')
# HMAC-SHA1 of the plaintext, keyed with the secret key
my_sign = hmac.new(secretkey, signature, hashlib.sha1).digest()
# Authorization value: Base64 of (binary digest + plaintext)
bb = my_sign + signature
sign1 = base64.b64encode(bb)
sign2 = str(sign1, 'utf-8')
print(sign2)
url = "http://recognition.image.myqcloud.com/ocr/general"
headers = {'Host': 'recognition.image.myqcloud.com',
           "Authorization": sign2,
           }

# Send appid, bucket and the image as multipart/form-data fields;
# the with block closes the image file once the request is done
with open('d:/temp/0001.png', 'rb') as image_file:
    files = {'appid': (None, appid),
             'bucket': (None, bucket),
             'image': ('1.jpg', image_file, 'image/jpeg')
             }
    r = requests.post(url, files=files, headers=headers)

responseinfo = r.content

print(responseinfo)
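
Since the signature was the part that cost me a day, here is the same signing logic pulled into a function so it can be checked on its own. This is only a sketch of the steps used in the script above (build the plaintext, HMAC-SHA1 it with the secret key, Base64 the digest followed by the plaintext). The function name `build_signature` and its `now`, `ttl` and `rand` parameters are my own additions so that the output can be made repeatable; they are not from Tencent's SDK.

```python
import base64
import hashlib
import hmac
import random
import time


def build_signature(appid, bucket, secret_id, secret_key,
                    now=None, ttl=2592000, rand=None):
    """Return the Authorization header value used by the script above."""
    # Integer Unix timestamp; pass `now` explicitly for repeatable output
    now = int(time.time()) if now is None else int(now)
    if rand is None:
        rand = random.randint(0, 9999999999)
    plain = "a={}&b={}&k={}&e={}&t={}&r={}&u=0&f=".format(
        appid, bucket, secret_id, now + ttl, now, rand)
    raw = plain.encode("utf-8")
    # HMAC-SHA1 digest of the plaintext, keyed with the secret key
    digest = hmac.new(secret_key.encode("utf-8"), raw, hashlib.sha1).digest()
    # Base64 of (binary digest + plaintext)
    return base64.b64encode(digest + raw).decode("utf-8")


# Placeholder credentials, fixed time and nonce, just to show the call
sig = build_signature("123", "", "id", "key", now=1000, rand=42)
decoded = base64.b64decode(sig)
print(decoded[20:].decode("utf-8"))
```

A SHA-1 digest is always 20 bytes, so decoding the Base64 value and looking at everything after the first 20 bytes shows the plaintext that was signed. That is a quick way to compare your output with what another SDK produces.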

 

On the same image, Baidu surprisingly did worse: it misread a perfectly clear "USD" as "JSD".
