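Today I'll cover how to use Baidu's speech synthesis (text-to-speech) service from Python. You send a piece of text to Baidu's server, which returns a binary audio stream. The backend base64-encodes this binary data and returns it to the front end, which decodes it and plays the audio. The binary stream can also be saved to an audio file, for example a WAV file.
To use Baidu speech synthesis, you first need to register on the Baidu AI Open Platform.
The code is as follows: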
    import asyncio
    import base64
    import json

    from aiohttp import ClientSession


    async def get_baidu_voice(text, baidu_voice_token):
        """Request synthesized speech from Baidu and return it base64-encoded."""
        content_audio = {
            'tex': text,
            'tok': baidu_voice_token,
            'cuid': 'default',
            'ctp': '1',
            'lan': 'zh',
            'per': '4',
        }
        speech_url = 'https://tsn.baidu.com/text2audio?'
        headers = {
            # 'Content-Type': 'audio/mp3'
            # Kept from the original; aiohttp still sends the dict body form-encoded
            'Content-Type': 'application/json'
        }
        async with ClientSession() as session:
            async with session.post(url=speech_url, data=content_audio, headers=headers) as res:
                ret = await res.content.read()

        try:
            # Convert bytes to str; an error response is JSON text and decodes cleanly
            ret_str = str(ret, encoding="utf-8")
        except UnicodeDecodeError:
            # Normal response: binary audio, assumed not to be valid UTF-8 text
            return base64.b64encode(ret)

        # Error response: a JSON document carrying an error code
        ret_dict = json.loads(ret_str)
        err_no = ret_dict.get("err_no")
        if err_no == 502:
            raise RuntimeError("access token expired, please check")
        elif err_no == 501:
            raise RuntimeError("the input arguments are incorrect, please check")
        elif err_no == 503:
            raise RuntimeError("synthesis backend error")
        elif err_no == 500:
            raise RuntimeError("unsupported input")
        # Any other reply is unexpected; surface it instead of returning None silently
        raise RuntimeError("speech synthesis failed: %s" % ret_dict)
The function takes two arguments: the text to convert to speech and the Baidu voice token.
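To make the base64 step concrete, here is a minimal sketch of what a consumer of the encoded audio might do: decode it back to raw bytes and write them to a file. The helper name `save_speech`, the output path, and the stand-in bytes are all hypothetical and used only for illustration; in real use the encoded value would come from the synthesis code above.

```python
import base64
import os
import tempfile


def save_speech(speech_b64, path):
    """Decode base64-encoded audio and write the raw bytes to path."""
    audio_bytes = base64.b64decode(speech_b64)
    with open(path, "wb") as f:
        f.write(audio_bytes)
    # Return the number of bytes written so the caller can sanity-check it
    return len(audio_bytes)


# Stand-in bytes (hypothetical); real audio would come from the Baidu service
fake_audio = b"RIFF\x00\x00\x00\x00WAVEfmt "

# What the backend would hand to the front end
encoded = base64.b64encode(fake_audio)

out_path = os.path.join(tempfile.gettempdir(), "speech_demo.wav")
written = save_speech(encoded, out_path)
```

The file extension should match whatever audio format was actually requested from the service; the `.wav` name here is only a placeholder.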
The Baidu voice token is obtained as follows:
    import json
    import urllib.request


    def get_baidu_voice_token(client_id, client_secret):
        # client_id is the AK (API Key) and client_secret is the SK (Secret Key),
        # both obtained from the official website
        host = ('https://aip.baidubce.com/oauth/2.0/token?grant_type=client_credentials'
                '&client_id=' + client_id + '&client_secret=' + client_secret)
        request = urllib.request.Request(host)
        request.add_header('Content-Type', 'application/json; charset=UTF-8')
        response = urllib.request.urlopen(request)
        content_bytes = response.read()
        content_dict = json.loads(str(content_bytes, encoding="utf-8"))
        if content_dict:
            # The token expires after 28 days
            return content_dict["access_token"]
        return None
Note that the token has a limited validity period, so you need to fetch a new one periodically.
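One possible way to handle the refresh is to cache the token and fetch a new one shortly before it expires. The sketch below is only an illustration under stated assumptions: the `TokenCache` class, its parameters, and the fake fetch function and clock are hypothetical names introduced here, not part of Baidu's API.

```python
import time


class TokenCache:
    """Cache a token and re-fetch it shortly before it expires (hypothetical helper)."""

    def __init__(self, fetch, lifetime_seconds, margin_seconds=60, clock=time.time):
        self._fetch = fetch                # callable that returns a fresh token
        self._lifetime = lifetime_seconds  # how long a token is assumed to stay valid
        self._margin = margin_seconds      # refresh this many seconds before expiry
        self._clock = clock                # injectable clock, handy for testing
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        # Fetch on first use, or when the token is within the safety margin of expiry
        if self._token is None or now >= self._expires_at - self._margin:
            self._token = self._fetch()
            self._expires_at = now + self._lifetime
        return self._token


# Demonstration with a fake fetch function and a fake clock (no network involved)
calls = []


def fake_fetch():
    calls.append(1)
    return "token-%d" % len(calls)


now = [0.0]
cache = TokenCache(fake_fetch, lifetime_seconds=100, margin_seconds=10,
                   clock=lambda: now[0])

first = cache.get()   # first use: fetches a token
now[0] = 50.0
second = cache.get()  # still valid: reuses the cached token
now[0] = 95.0
third = cache.get()   # inside the safety margin: fetches again
```

In real use, `fetch` would be a function that wraps `get_baidu_voice_token`, and `lifetime_seconds` would reflect the token's actual lifetime, for example `28 * 24 * 3600` if the 28-day figure noted in the code above holds.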
That's all for calling Baidu speech synthesis. If you spot any mistakes, corrections and feedback are welcome!