Python Full-Stack Development, Day 123 (Turing Robot, automated interactive Q&A with web recording)

Date: 2019-11-17


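Recap of the previous day

1. Baidu AI Open Platform
2. AipSpeech: speech synthesis and speech recognition
3. NLP: short-text similarity
4. Building a simple Q&A robot
5. Preparing audio for speech recognition with ffmpeg (a tool used very widely across audio and video work)
    When no particular sample rate is required, ffmpeg picks the output format from the file extension:
    ffmpeg -i a.mp3 a.wav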
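
When a specific format is required, as it is for speech recognition later in this post, the ffmpeg call needs explicit flags. The sketch below only builds the argument list, using the same flags as the get_file_content function in this post; the helper name build_ffmpeg_pcm_cmd is hypothetical. Passing a list to subprocess instead of a formatted string to os.system also means file names containing spaces are handled correctly.

```python
import shlex


def build_ffmpeg_pcm_cmd(src_path, sample_rate=16000):
    """Return the ffmpeg argument list that converts src_path to raw 16-bit mono PCM."""
    return [
        "ffmpeg", "-y",            # overwrite the output file without asking
        "-i", src_path,            # input file
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM samples
        "-f", "s16le",             # raw output container
        "-ac", "1",                # mono
        "-ar", str(sample_rate),   # sample rate in Hz
        src_path + ".pcm",         # output file next to the input
    ]


cmd = build_ffmpeg_pcm_cmd("my audio.m4a")
print(shlex.join(cmd))
# To actually run it (requires ffmpeg on PATH):
#   import subprocess; subprocess.run(cmd, check=True)
```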

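1. Turing Robot

Introduction

Turing Robot (圖靈機器人) is an artificial intelligence company whose core driver is semantic technology. Its stated mission is to "let machines understand the world", and its products and services include a robot open platform, a robot OS, and scenario solutions.

The official site is:

http://www.tuling123.com/

Usage

First register an account, or sign in with a third-party login; either works.

After signing in, click "Create robot".

The robot name can be anything you like.

Select Website -> Education & Learning -> Other, then enter a short description.

Once the robot is created, click "Terminal settings" and scroll to the bottom.

There you will see the API access section with an apikey under it, which is needed later.

The panel on the right lets you chat with the robot.

You can set its personal profile.

Test a chat.

The features listed below "Horoscope" are all paid.

The skill extensions can all be switched on.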

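Using the API

Click the API documentation link. The 1.0 API has been taken offline; only 2.0 is available now.

https://www.kancloud.cn/turing/www-tuling123-com/718227

Encoding

UTF-8 (every stage of a call to the Turing API uses UTF-8)

Endpoint

http://openapi.tuling123.com/openapi/api/v2

Request method

HTTP POST

Request parameters

The request body is JSON.
Request example: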

{
    "reqType":0,
    "perception": {
        "inputText": {
            "text": "附近的酒店"
        },
        "inputImage": {
            "url": "imageUrl"
        },
        "selfInfo": {
            "location": {
                "city": "北京",
                "province": "北京",
                "street": "信息路"
            }
        }
    },
    "userInfo": {
        "apiKey": "",
        "userId": ""
    }
}
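
The request body above can be assembled by a small function instead of being written out by hand. This is a minimal sketch, not part of the official SDK: the function name build_payload and its location argument are hypothetical, and it assumes the optional selfInfo block can simply be omitted when no location is known.

```python
import json


def build_payload(text, api_key, user_id, location=None):
    """Build a fresh v2 request body for one text question."""
    payload = {
        "reqType": 0,  # 0 = text input
        "perception": {"inputText": {"text": text}},
        "userInfo": {"apiKey": api_key, "userId": str(user_id)},
    }
    if location:
        # location is a dict such as {"city": ..., "province": ..., "street": ...}
        payload["perception"]["selfInfo"] = {"location": dict(location)}
    return payload


body = build_payload("附近的酒店", "YOUR_API_KEY", "xiao", location={"city": "北京"})
print(json.dumps(body, ensure_ascii=False, indent=2))
```

Because every call returns a new dict, two users asking questions at the same time cannot overwrite each other's text or userId.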

Example:

Create a new file, tuling.py, that asks about the weather:

import requests
import json

apiKey = "6a944508fd5c4d499b9991862ea12345"

userId = "xiao"  # any name works, but it must use English letters
data = {
    # Request type: 0 = text, 1 = image, 2 = audio
    "reqType": 0,
    # Input information (required)
    "perception": {
        # Text input
        "inputText": {
            # The question (the weather in Beijing over the next seven days)
            "text": "北京將來七天，天氣怎麼樣"
        }
    },
    # Required user information
    "userInfo": {
        # Turing Robot apikey
        "apiKey": apiKey,
        # Unique user identifier
        "userId": userId
    }
}

tuling_url = "http://openapi.tuling123.com/openapi/api/v2"

res = requests.post(tuling_url,json=data)  # send the request
# Decode the response body
res_dic = json.loads(res.content.decode("utf-8"))  # type:dict
# Extract the text from the response
res_type = res_dic.get("results")[0].get("values").get("text")
print(res_type)

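Output:

北京:週二 09月11日,多雲 南風微風,最低氣溫19度，最高氣溫26度

(Beijing: Tuesday, September 11, cloudy, light southerly breeze, low 19 degrees, high 26 degrees.)

The returned text can then be passed to the Baidu API, converted to an audio file, and played automatically.

Modify baidu_ai.py and wrap this in a function, text2audio: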

import os
from aip import AipSpeech
from aip import AipNlp

""" Your APPID, AK, and SK """
APP_ID = '11212345'
API_KEY = 'pVxdhsXS1BIaiwYYNT712345'
SECRET_KEY = 'BvHQOts27LpGFbt3RAOv84WfPCW12345'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
nlp_client = AipNlp(APP_ID, API_KEY, SECRET_KEY)

# Read an audio file after converting it to PCM
def get_file_content(filePath):
    cmd_str = "ffmpeg -y  -i %s  -acodec pcm_s16le -f s16le -ac 1 -ar 16000 %s.pcm"%(filePath,filePath)
    os.system(cmd_str)  # run the ffmpeg system command on the given audio file
    with open(filePath + ".pcm", 'rb') as fp:
        return fp.read()

def text2audio(text):  # convert text to audio
    ret = client.synthesis(text, 'zh', 1, {'spd': 4, 'vol': 5, 'pit': 8, 'per': 4})
    if not isinstance(ret, dict):
        with open('audio.mp3', 'wb') as f:
            f.write(ret)

        os.system("audio.mp3")  # open the system default audio player (this form works on Windows)

Modify tuling.py to call text2audio:

import requests
import json
import baidu_ai

apiKey = "6a944508fd5c4d499b9991862ea12345"

userId = "xiao"  # any name works, but it must use English letters
data = {
    # Request type: 0 = text, 1 = image, 2 = audio
    "reqType": 0,
    # Input information (required)
    "perception": {
        # Text input
        "inputText": {
            # The question (the weather in Beijing over the next seven days)
            "text": "北京將來七天，天氣怎麼樣"
        }
    },
    # Required user information
    "userInfo": {
        # Turing Robot apikey
        "apiKey": apiKey,
        # Unique user identifier
        "userId": userId
    }
}

tuling_url = "http://openapi.tuling123.com/openapi/api/v2"

res = requests.post(tuling_url,json=data)  # send the request
# Decode the response body
res_dic = json.loads(res.content.decode("utf-8"))  # type:dict
# Extract the text from the response
result = res_dic.get("results")[0].get("values").get("text")
# print(res_type)

baidu_ai.text2audio(result)

Run tuling.py. It opens the audio player automatically and says: 北京:週二 09月11日,多雲南風微風,最低氣溫19度，最高氣溫26度
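
Passing a bare file name to os.system opens the default player only on Windows. As a hedged sketch of a more portable approach, the hypothetical helper below picks an opener command per platform; it assumes the open command on macOS and xdg-open on Linux desktops, and it only returns the command rather than running it.

```python
import sys


def player_command(path, platform=sys.platform):
    """Return the command that opens path with the platform's default application."""
    if platform.startswith("win"):
        # "start" is a cmd built-in; the empty string is the window title argument
        return ["cmd", "/c", "start", "", path]
    if platform == "darwin":
        return ["open", path]
    return ["xdg-open", path]


print(player_command("audio.mp3"))
# To actually play the file: import subprocess; subprocess.run(player_command("audio.mp3"))
```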

Here is a parameter reference for Turing Robot that someone else has already compiled:

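Turing Robot 2.0
POST: http://openapi.tuling123.com/openapi/api/v2

Request parameters:
{
    // Input type: 0 = text, 1 = image, 2 = audio
    "reqType":0,
    // Input information (required)
    "perception": {
        // Text input. None of the three inputs is individually required, but at least one must be supplied
        "inputText": {
            // The question text
            "text": "附近的酒店"
        },
        // Image input
        "inputImage": {
            // URL of the submitted image
            "url": "imageUrl"
        },
        // Audio input
        "inputMedia": {
            // URL of the submitted audio
            "url":"mediaUrl"
        },
        // Client attributes (optional)
        "selfInfo": {
            // Location information (optional)
            "location": {
                // City
                "city": "北京",
                // Province
                "province": "北京",
                // Street
                "street": "信息路"
            }
        }
    },
    // User parameters (the userid of the original API)
    "userInfo": {
        // apikey: the application key
        "apiKey": "",
        // Unique user identifier
        "userId": ""
    }
}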



Response example:
{
    // Request intent
    "intent": {
            // Output function code
            "code": 10005,
            // Intent name
            "intentName": "",
            // Intent action name
            "actionName": "",
            // Function-related parameters
            "parameters": {
                "nearby_place": "酒店"
            }
    },
    // Output result set
    "results": [
        {
            // Result group: results with the same groupType belong together; 0 means standalone
            "groupType": 1,
            // Result type: text (text); link (url); audio (voice); video (video); image (image); rich article (news)
            "resultType": "url",
            // Return value
            "values": {
                "url": "http://m.elong.com/hotel/0101/nlist/#indate=2016-12-10&outdate=2016-12-11&keywords=%E4%BF%A1%E6%81%AF%E8%B7%AF"
            }
        },
        {
            // Same group as the result above (groupType 1)
            "groupType": 1,
            "resultType": "text",
            "values": {
                "text": "親，已幫你找到相關酒店信息"
            }
        }
    ]
}
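
Note that in this response the first entry of results is a url, not a text, so reading results[0].values.text as the scripts in this post do would yield nothing for it. A more defensive reader can collect every text entry instead. This is a sketch under the assumption that responses follow the shape shown above; collect_texts is a hypothetical helper name.

```python
def collect_texts(response):
    """Return every text value found in response["results"], in order."""
    texts = []
    for item in response.get("results") or []:
        if item.get("resultType") == "text":
            text = (item.get("values") or {}).get("text")
            if text:
                texts.append(text)
    return texts


sample = {
    "intent": {"code": 10005},
    "results": [
        {"groupType": 1, "resultType": "url", "values": {"url": "http://example.com/"}},
        {"groupType": 1, "resultType": "text", "values": {"text": "親，已幫你找到相關酒店信息"}},
    ],
}
print(collect_texts(sample))
```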

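Alternatively, see the official API documentation:

https://www.kancloud.cn/turing/www-tuling123-com/718227

Next, we again use the whatyouname.m4a recording from earlier.

When asked "what is your name?", the program answers: 我叫小青龍 ("My name is Xiao Qinglong").

Any other question is answered by Turing Robot.

Modify baidu_ai.py: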

from aip import AipSpeech
import time, os
from baidu_nlp import nlp_client
import tuling

""" Your APPID, AK, and SK """
APP_ID = '11212345'
API_KEY = 'pVxdhsXS1BIaiwYYNT712345'
SECRET_KEY = 'BvHQOts27LpGFbt3RAOv84WfPCW12345'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
# nlp_client = AipNlp(APP_ID, API_KEY, SECRET_KEY)

# Read an audio file after converting it to PCM
def get_file_content(filePath):
    cmd_str = "ffmpeg -y  -i %s  -acodec pcm_s16le -f s16le -ac 1 -ar 16000 %s.pcm"%(filePath,filePath)
    os.system(cmd_str)  # run the ffmpeg system command on the given audio file
    with open(filePath + ".pcm", 'rb') as fp:
        return fp.read()

def text2audio(text):  # convert text to audio
    ret = client.synthesis(text, 'zh', 1, {'spd': 4, 'vol': 5, 'pit': 8, 'per': 4})
    if not isinstance(ret, dict):
        with open('audio.mp3', 'wb') as f:
            f.write(ret)

        os.system("audio.mp3")  # open the system default audio player (this form works on Windows)

# Recognize a local audio file
def audio2text(file_path):
    a = client.asr(get_file_content(file_path), 'pcm', 16000, {
        'dev_pid': 1536,
    })

    # print(a["result"])
    if a.get("result") :
        return a.get("result")[0]

def my_nlp(q,uid):
    a = "我不知道你在說什麼"
    if nlp_client.simnet(q,"你的名字叫什麼").get("score") >= 0.7:
        a = "我叫小青龍"
        return a

    a = tuling.to_tuling(q,uid)
    return a
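
The routing idea in my_nlp (answer a known question locally, otherwise fall back to the chatbot) can be tried without any network access by passing the similarity and fallback functions in as arguments. This is an illustrative sketch: route_question, fake_similarity, and fake_fallback are hypothetical stand-ins for nlp_client.simnet and tuling.to_tuling, and the 0.7 threshold is the one used above.

```python
def route_question(question, user_id, similarity, fallback, threshold=0.7):
    """Answer from a table of canned replies, else delegate to fallback."""
    canned = {"你的名字叫什麼": "我叫小青龍"}
    for known_question, answer in canned.items():
        if similarity(question, known_question) >= threshold:
            return answer
    return fallback(question, user_id)


def fake_similarity(a, b):
    # Stand-in for the similarity score: exact match only
    return 1.0 if a == b else 0.0


def fake_fallback(question, user_id):
    # Stand-in for the chatbot call
    return "fallback:" + question


print(route_question("你的名字叫什麼", "xiao", fake_similarity, fake_fallback))
print(route_question("北京天氣", "xiao", fake_similarity, fake_fallback))
```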

Modify baidu_nlp.py:

from aip import AipNlp

APP_ID = '11212345'
API_KEY = 'pVxdhsXS1BIaiwYYNT712345'
SECRET_KEY = 'BvHQOts27LpGFbt3RAOv84WfPCW12345'

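nlp_client = AipNlp(APP_ID,API_KEY,SECRET_KEY)

# Run the demo only when this file is executed directly,
# so that importing nlp_client from baidu_ai.py has no side effects
if __name__ == "__main__":
    # Call the short-text similarity API
    res = nlp_client.simnet("你叫什麼名字","你的名字叫什麼")
    print(res)

    # If the similarity reaches 70%
    if res.get("score") >= 0.7:
        print("我叫青龍")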

Modify tuling.py:

import requests
import json

apiKey = "6a944508fd5c4d499b9991862ea12345"

userId = "xiao"  # any name works, but it must use English letters
data = {
    # Request type: 0 = text, 1 = image, 2 = audio
    "reqType": 0,
    # Input information (required)
    "perception": {
        # Text input
        "inputText": {
            # The question (a placeholder; to_tuling overwrites it on every call)
            "text": "北京今天天氣怎麼樣"
        }
    },
    # Required user information
    "userInfo": {
        # Turing Robot apikey
        "apiKey": apiKey,
        # Unique user identifier
        "userId": userId
    }
}

tuling_url = "http://openapi.tuling123.com/openapi/api/v2"

def to_tuling(q,user_id):
    # Set the question in the request's inputText
    data["perception"]["inputText"]["text"] = q
    # Set the userId in userInfo
    data["userInfo"]["userId"] = user_id

    res = requests.post(tuling_url,json=data)  # send the request
    # Decode the response body
    res_dic = json.loads(res.content.decode("utf-8"))  # type:dict
    # Extract the text from the response
    result = res_dic.get("results")[0].get("values").get("text")
    # print(res_type)

    return result

Create main.py:

import baidu_ai

uid = 1234
file_name = "whatyouname.m4a"
q = baidu_ai.audio2text(file_name)
# print(q,'qqqqqqqqqq')
a = baidu_ai.my_nlp(q,uid)
# print(a,'aaaaaaaaa')
baidu_ai.text2audio(a)

Run main.py. It opens the audio file, which says: 我叫小青龍

Modify baidu_ai.py and comment out the check for the question 你的名字叫什麼 ("what is your name"):

from aip import AipSpeech
import time, os
from baidu_nlp import nlp_client
import tuling

""" Your APPID, AK, and SK """
APP_ID = '11212345'
API_KEY = 'pVxdhsXS1BIaiwYYNT712345'
SECRET_KEY = 'BvHQOts27LpGFbt3RAOv84WfPCW12345'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
# nlp_client = AipNlp(APP_ID, API_KEY, SECRET_KEY)

# Read an audio file after converting it to PCM
def get_file_content(filePath):
    cmd_str = "ffmpeg -y  -i %s  -acodec pcm_s16le -f s16le -ac 1 -ar 16000 %s.pcm"%(filePath,filePath)
    os.system(cmd_str)  # run the ffmpeg system command on the given audio file
    with open(filePath + ".pcm", 'rb') as fp:
        return fp.read()

def text2audio(text):  # convert text to audio
    ret = client.synthesis(text, 'zh', 1, {'spd': 4, 'vol': 5, 'pit': 8, 'per': 4})
    if not isinstance(ret, dict):
        with open('audio.mp3', 'wb') as f:
            f.write(ret)

        os.system("audio.mp3")  # open the system default audio player (this form works on Windows)

# Recognize a local audio file
def audio2text(file_path):
    a = client.asr(get_file_content(file_path), 'pcm', 16000, {
        'dev_pid': 1536,
    })

    # print(a["result"])
    if a.get("result") :
        return a.get("result")[0]

def my_nlp(q,uid):
    # a = "我不知道你在說什麼"
    # if nlp_client.simnet(q,"你的名字叫什麼").get("score") >= 0.7:
    #     a = "我叫小青龍"
    #     return a

    a = tuling.to_tuling(q,uid)
    return a

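Run main.py again. This time the audio says: 叫我圖靈機器人就可以了！ ("Just call me Turing Robot!")

This is tedious: every question requires recording an audio clip first.

The next part uses in-browser recording to automate the question-and-answer loop.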

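2. Automated interactive Q&A with web recording

werkzeug

First, what is Werkzeug? It is neither a web server nor a web framework; its documentation describes it as a WSGI toolkit. It can serve as the underlying library of a web framework because it already wraps much of what a framework needs, such as the Request and Response objects.

Flask, the framework I use most often, is built on top of Werkzeug. On its own it only handles ordinary HTTP requests.

WebSocket

WebSocket is a network communication protocol; RFC 6455 defines its standard.

It arrived alongside HTML5 and provides full-duplex communication over a single TCP connection.

Why not plain werkzeug

HTTP is a stateless, request-response application-layer protocol. The client always initiates the exchange, so with HTTP alone the server cannot push a message to the client on its own.

A WebSocket connection is long-lived, and both the browser and the server must implement the WebSocket protocol to establish and maintain it.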
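
One concrete piece of "implementing the protocol" is the opening handshake: per RFC 6455, the server proves it understood the upgrade request by hashing the client's Sec-WebSocket-Key together with a fixed GUID and returning the result as Sec-WebSocket-Accept. The sketch below shows just that computation; libraries such as gevent-websocket do this internally, and the function name websocket_accept is hypothetical.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def websocket_accept(sec_websocket_key):
    """Compute the Sec-WebSocket-Accept value for a client's Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455, section 1.3
print(websocket_accept("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```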

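Here Flask is the backend and a WebSocket receives the audio sent by the frontend, because the server cannot know when the user will start recording.

Getting started

Create a new folder named web_ai.

Create ai.py, which listens on a WebSocket: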

from flask import Flask,request,render_template,send_file
from geventwebsocket.handler import WebSocketHandler
from gevent.pywsgi import WSGIServer
from geventwebsocket.websocket import WebSocket

app = Flask(__name__)

@app.route("/index")
def index():
    # Get the WebSocket object for this request
    user_socket = request.environ.get("wsgi.websocket") # type:WebSocket
    print(user_socket)
    print(request.remote_addr)  # remote IP address
    while True:
        # Receive a message
        msg = user_socket.receive()
        if msg is None:  # the client closed the connection
            break
        print(msg)
    return ""

@app.route("/")
def home_page():
    return render_template("index.html")

if __name__ == '__main__':
    # Create a WSGI server whose handler supports WebSocket
    http_serv = WSGIServer(("0.0.0.0",5000),app,handler_class=WebSocketHandler)
    # Start listening for requests
    http_serv.serve_forever()

Create a templates directory and, inside it, a new file index.html that creates a WebSocket object:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>

</head>
<body>

</body>

<script type="application/javascript">
    // Create the WebSocket object
    var ws = new WebSocket("ws://127.0.0.1:5000/index");

</script>
</html>

Start Flask and open the home page:

Note: the page is blank at this point, which is expected.

Check the PyCharm console output:

<geventwebsocket.websocket.WebSocket object at 0x000002EA6A3F39A0>
127.0.0.1

So how does the page send audio to the backend? With Recorder.js.

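Recorder

Recorder.js is an HTML5 recording plugin that records audio in the browser.

It does not support IE or Safari. Other browsers work, though some versions have extra requirements:
Chrome 47 and later, and QQ Browser, require HTTPS. Note: when the page is served over the public internet it must use HTTPS, otherwise recording will not work.

It can be downloaded from GitHub:

https://github.com/mattdiamond/Recorderjs

For common HTML5 Audio attributes, functions, and events, see:

https://blog.csdn.net/bright2017/article/details/80041448

After downloading, unpack the archive, go into the dist directory, and copy recorder.js to the desktop.

Open the Flask project web_ai, go into its static directory (create it if it does not exist), and move recorder.js there.

The project structure is as follows: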

./
├── ai.py
├── static
│   └── recorder.js
└── templates
    └── index.html

Recording audio

Modify index.html to load recorder.js:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>

</head>
<body>
{# audio is an HTML5 tag; autoplay plays automatically, controls shows the player controls #}
<audio src="" autoplay controls id="player"></audio>
<br>
<button onclick="start_reco()">開始廢話</button>
<br>
<button onclick="stop_reco()">發送語音</button>
</body>
<script src="/static/recorder.js"></script>
<script type="application/javascript">
    // Create the WebSocket object
    var ws = new WebSocket("ws://127.0.0.1:5000/index");
    var reco = null;  // the recorder object
    // Create an AudioContext. It represents an audio-processing graph built from
    // linked audio modules, each of which is an AudioNode.
    var audio_context = new AudioContext();
    // getUserMedia is needed to capture audio and video. Desktop browsers that
    // support it include Chrome, Firefox, Opera, and Edge.
    // The || chain picks whichever vendor-prefixed version the browser provides.
    navigator.getUserMedia = (navigator.getUserMedia ||
        navigator.webkitGetUserMedia ||
        navigator.mozGetUserMedia ||
        navigator.msGetUserMedia);

    // Request the media stream, asking for audio only
    navigator.getUserMedia({audio: true}, create_stream, function (err) {
        console.log(err)
    });

    // Create the media stream container
    function create_stream(user_media) {
        // createMediaStreamSource() takes a MediaStream (here, the one obtained from
        // navigator.getUserMedia) and returns a MediaStreamAudioSourceNode, an AudioNode
        // that acts as an audio source, so the microphone audio can be played and processed.
        var stream_input = audio_context.createMediaStreamSource(user_media);
        // Hand the source to a Recorder; whatever the microphone picks up is recorded as a stream
        reco = new Recorder(stream_input);
    }

    function start_reco() {  // start recording
        reco.record(); // start writing the stream into the buffer
    }

    function stop_reco() {  // stop recording
        reco.stop();  // stop writing the stream
        get_audio();  // call the helper defined below
        reco.clear(); // clear the buffer
    }

    // Export the recorded audio
    function get_audio() {
        reco.exportWAV(function (wav_file) {
            // Send the data to the backend
            ws.send(wav_file);
        })
    }

</script>
</html>

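Restart Flask and open the page. The result looks like this:

Click to allow microphone access.

Click 開始廢話 ("start talking"), say something, then click 發送語音 ("send audio") to stop recording and send it.

Check the PyCharm console output: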

<geventwebsocket.websocket.WebSocket object at 0x000002515BFE3C10>
127.0.0.1
bytearray(b'RIFF$\x00\x04\x00WAVEfmt...\x10')

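The server receives a bytearray. It is the raw stream data of a WAV file (note the RIFF and WAVE markers at the start), so it can be saved directly as an audio file.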
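
Before saving such a payload, it can be useful to confirm that it really is WAV data and to see its format. The sketch below does this with the standard-library wave module; describe_wav is a hypothetical helper, and the one-second silent clip built in memory is only a stand-in for a real browser recording.

```python
import io
import wave


def describe_wav(data):
    """Return (channels, sample_width_bytes, frame_rate, duration_seconds) for WAV bytes."""
    with wave.open(io.BytesIO(bytes(data)), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return wav.getnchannels(), wav.getsampwidth(), rate, frames / rate


# Build a one-second silent mono WAV in memory as a stand-in for a browser recording
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)        # 16-bit samples
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 16000)

info = describe_wav(bytearray(buf.getvalue()))
print(info)  # (1, 2, 16000, 1.0)
```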

Modify ai.py to check for the bytearray type and write the data to a file:

from flask import Flask,request,render_template,send_file
from geventwebsocket.handler import WebSocketHandler
from gevent.pywsgi import WSGIServer
from geventwebsocket.websocket import WebSocket

app = Flask(__name__)

@app.route("/index")
def index():
    # Get the WebSocket object for this request
    user_socket = request.environ.get("wsgi.websocket") # type:WebSocket
    print(user_socket)
    print(request.remote_addr)  # remote IP address
    while True:
        # Receive a message
        msg = user_socket.receive()
        if msg is None:  # the client closed the connection
            break
        if type(msg) == bytearray:
            # Write it to 123.wav
            with open("123.wav", "wb") as f:
                f.write(msg)
    return ""

@app.route("/")
def home_page():
    return render_template("index.html")

if __name__ == '__main__':
    # Create a WSGI server whose handler supports WebSocket
    http_serv = WSGIServer(("0.0.0.0",5000),app,handler_class=WebSocketHandler)
    # Start listening for requests
    http_serv.serve_forever()

Restart Flask and record again. A new file, 123.wav, appears in the project directory.

Open and play it: it is the audio that was just recorded.
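
Writing every recording to the same 123.wav means two users recording at once would overwrite each other. One possible variation, shown here only as a sketch, gives each recording its own name; save_recording is a hypothetical helper and the uuid-based naming is an assumption, not something the original project does.

```python
import os
import tempfile
import uuid


def save_recording(data, directory="."):
    """Write data to a uniquely named .wav file and return its path."""
    path = os.path.join(directory, uuid.uuid4().hex + ".wav")
    with open(path, "wb") as f:
        f.write(data)
    return path


# Demonstrate in a temporary directory so nothing is left behind
with tempfile.TemporaryDirectory() as tmp:
    first = save_recording(b"RIFF-demo-1", tmp)
    second = save_recording(b"RIFF-demo-2", tmp)
    print(os.path.basename(first) != os.path.basename(second))
```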

Getting the file name

Copy over the baidu_ai.py and tuling.py written earlier.

Modify baidu_ai.py so that text2audio returns the file name:

from aip import AipSpeech
import time, os
# from baidu_nlp import nlp_client
import tuling

""" Your APPID, AK, and SK """
APP_ID = '11212345'
API_KEY = 'pVxdhsXS1BIaiwYYNT712345'
SECRET_KEY = 'BvHQOts27LpGFbt3RAOv84WfPCW12345'

client = AipSpeech(APP_ID, API_KEY, SECRET_KEY)
# nlp_client = AipNlp(APP_ID, API_KEY, SECRET_KEY)

# Read an audio file after converting it to PCM
def get_file_content(filePath):
    cmd_str = "ffmpeg -y  -i %s  -acodec pcm_s16le -f s16le -ac 1 -ar 16000 %s.pcm"%(filePath,filePath)
    os.system(cmd_str)  # run the ffmpeg system command on the given audio file
    with open(filePath + ".pcm", 'rb') as fp:
        return fp.read()

def text2audio(text):  # convert text to audio
    ret = client.synthesis(text, 'zh', 1, {'spd': 4, 'vol': 5, 'pit': 8, 'per': 4})
    if not isinstance(ret, dict):
        with open('audio.mp3', 'wb') as f:
            f.write(ret)

        # os.system("audio.mp3")  # open the system default audio player
    return 'audio.mp3'

# Recognize a local audio file
def audio2text(file_path):
    a = client.asr(get_file_content(file_path), 'pcm', 16000, {
        'dev_pid': 1536,
    })

    # print(a["result"])
    if a.get("result") :
        return a.get("result")[0]

def my_nlp(q,uid):
    # a = "我不知道你在說什麼"
    # if nlp_client.simnet(q,"你的名字叫什麼").get("score") >= 0.7:
    #     a = "我叫小青龍"
    #     return a

    a = tuling.to_tuling(q,uid)
    return a

Modify tuling.py:

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import requests
import json

apiKey = "6a944508fd5c4d499b9991862ea12345"

userId = "xiao"  # any name works, but it must use English letters
data = {
    # Request type: 0 = text, 1 = image, 2 = audio
    "reqType": 0,
    # Input information (required)
    "perception": {
        # Text input
        "inputText": {
            # The question (a placeholder; to_tuling overwrites it on every call)
            "text": "北京今天天氣怎麼樣"
        }
    },
    # Required user information
    "userInfo": {
        # Turing Robot apikey
        "apiKey": apiKey,
        # Unique user identifier
        "userId": userId
    }
}

tuling_url = "http://openapi.tuling123.com/openapi/api/v2"

def to_tuling(q,user_id):
    # Set the question in the request's inputText
    data["perception"]["inputText"]["text"] = q
    # Set the userId in userInfo
    data["userInfo"]["userId"] = user_id

    res = requests.post(tuling_url,json=data)  # send the request
    # Decode the response body
    res_dic = json.loads(res.content.decode("utf-8"))  # type:dict
    # Extract the text from the response
    result = res_dic.get("results")[0].get("values").get("text")
    # print(res_type)

    return result

Modify ai.py to import the baidu_ai module:

from flask import Flask,request,render_template,send_file
from geventwebsocket.handler import WebSocketHandler
from gevent.pywsgi import WSGIServer
from geventwebsocket.websocket import WebSocket
import baidu_ai

app = Flask(__name__)

@app.route("/index/<uid>")
def index(uid):  # receives the uid from the URL
    # Get the WebSocket object for this request
    user_socket = request.environ.get("wsgi.websocket") # type:WebSocket
    print(user_socket)
    # print(request.remote_addr)  # remote IP address
    while True:
        # Receive a message
        msg = user_socket.receive()
        if msg is None:  # the client closed the connection
            break
        if type(msg) == bytearray:
            # Write it to 123.wav
            with open("123.wav", "wb") as f:
                f.write(msg)

            # Convert the audio file to text
            res_q = baidu_ai.audio2text("123.wav")
            # Call my_nlp, which calls Turing Robot internally
            res_a = baidu_ai.my_nlp(res_q, uid)
            # Convert the answer text to an audio file
            file_name = baidu_ai.text2audio(res_a)
            # Send the file name to the frontend
            user_socket.send(file_name)
    return ""

@app.route("/")
def home_page():
    return render_template("index.html")

@app.route("/get_file/<file_name>")  # fetch the audio file
def get_file(file_name):  # lets the frontend fetch the audio file from the backend for autoplay
    return send_file(file_name)

if __name__ == '__main__':
    # Create a WSGI server whose handler supports WebSocket
    http_serv = WSGIServer(("0.0.0.0",5000),app,handler_class=WebSocketHandler)
    # Start listening for requests
    http_serv.serve_forever()
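
The get_file route passes a name taken from the URL straight to send_file, so a request such as /get_file/ai.py would return the source code. A hedged sketch of a check that could run before send_file is shown below; is_safe_audio_name and the allowed extension set are assumptions for illustration, not part of the original project.

```python
import os

# Hypothetical whitelist: only audio files produced by this project
ALLOWED_EXTENSIONS = {".mp3", ".wav"}


def is_safe_audio_name(file_name):
    """Accept only plain audio file names with no directory component."""
    if "/" in file_name or "\\" in file_name:
        return False
    if file_name.startswith("."):
        return False
    return os.path.splitext(file_name)[1].lower() in ALLOWED_EXTENSIONS


for name in ["audio.mp3", "ai.py", "../audio.mp3"]:
    print(name, is_safe_audio_name(name))
```

In the route, a name that fails the check could be answered with a 404 instead of being served.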

Modify index.html to define ws.onmessage and print the file name:

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>

</head>
<body>
{# audio is an HTML5 tag; autoplay plays automatically, controls shows the player controls #}
<audio src="" autoplay controls id="player"></audio>
<br>
<button onclick="start_reco()">開始廢話</button>
<br>
<button onclick="stop_reco()">發送語音</button>
</body>
<script src="/static/recorder.js"></script>
<script type="application/javascript">
    // Create the WebSocket object; the path segment after index is the userid that Turing Robot needs
    var ws = new WebSocket("ws://127.0.0.1:5000/index/xiao");
    var reco = null;  // the recorder object
    // Create an AudioContext. It represents an audio-processing graph built from
    // linked audio modules, each of which is an AudioNode.
    var audio_context = new AudioContext();
    // getUserMedia is needed to capture audio and video. Desktop browsers that
    // support it include Chrome, Firefox, Opera, and Edge.
    // The || chain picks whichever vendor-prefixed version the browser provides.
    navigator.getUserMedia = (navigator.getUserMedia ||
        navigator.webkitGetUserMedia ||
        navigator.mozGetUserMedia ||
        navigator.msGetUserMedia);

    // Request the media stream, asking for audio only
    navigator.getUserMedia({audio: true}, create_stream, function (err) {
        console.log(err)
    });

    // Create the media stream container
    function create_stream(user_media) {
        // createMediaStreamSource() takes a MediaStream (here, the one obtained from
        // navigator.getUserMedia) and returns a MediaStreamAudioSourceNode, an AudioNode
        // that acts as an audio source, so the microphone audio can be played and processed.
        var stream_input = audio_context.createMediaStreamSource(user_media);
        // Hand the source to a Recorder; whatever the microphone picks up is recorded as a stream
        reco = new Recorder(stream_input);
    }

    function start_reco() {  // start recording
        reco.record(); // start writing the stream into the buffer
    }

    function stop_reco() {  // stop recording
        reco.stop();  // stop writing the stream
        get_audio();  // call the helper defined below
        reco.clear(); // clear the buffer
    }

    // Export the recorded audio
    function get_audio() {
        reco.exportWAV(function (wav_file) {
            // Send the data to the backend
            ws.send(wav_file);
        })
    }

    // Fires when data arrives from the server
    ws.onmessage = function (data) {
        console.log(data.data);  // print the file name
    }

</script>
</html>

Restart Flask, open the page, and record again.

Check the PyCharm console output:

      encoder         : Lavc58.19.102 pcm_s16le
size=      35kB time=00:00:01.10 bitrate= 256.0kbits/s speed=42.6x    
video:0kB audio:35kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000000%

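This is ffmpeg converting the recording to PCM for speech recognition. The answer text is then synthesized to an audio file, and the file name is returned.

Once that finishes, the file name appears in the browser console.

It is the file name returned by text2audio.

Autoplay

How can the page play audio.mp3 automatically?

Just change the src attribute of the element whose id is player. The path must be reachable from the browser.

Modify index.html: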

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Title</title>

</head>
<body>
{# audio is an HTML5 tag; autoplay plays automatically, controls shows the player controls #}
<audio src="" autoplay controls id="player"></audio>
<br>
<button onclick="start_reco()">開始廢話</button>
<br>
<button onclick="stop_reco()">發送語音</button>
</body>
<script src="/static/recorder.js"></script>
<script type="application/javascript">
    // Base URL of the backend get_file route; the file name is appended to it
    var get_file = "http://127.0.0.1:5000/get_file/";
    // Create the WebSocket object; the path segment after index is the userid that Turing Robot needs
    var ws = new WebSocket("ws://127.0.0.1:5000/index/xiao");
    var reco = null;  // the recorder object
    // Create an AudioContext. It represents an audio-processing graph built from
    // linked audio modules, each of which is an AudioNode.
    var audio_context = new AudioContext();
    // getUserMedia is needed to capture audio and video. Desktop browsers that
    // support it include Chrome, Firefox, Opera, and Edge.
    // The || chain picks whichever vendor-prefixed version the browser provides.
    navigator.getUserMedia = (navigator.getUserMedia ||
        navigator.webkitGetUserMedia ||
        navigator.mozGetUserMedia ||
        navigator.msGetUserMedia);

    // Request the media stream, asking for audio only
    navigator.getUserMedia({audio: true}, create_stream, function (err) {
        console.log(err)
    });

    // Create the media stream container
    function create_stream(user_media) {
        // createMediaStreamSource() takes a MediaStream (here, the one obtained from
        // navigator.getUserMedia) and returns a MediaStreamAudioSourceNode, an AudioNode
        // that acts as an audio source, so the microphone audio can be played and processed.
        var stream_input = audio_context.createMediaStreamSource(user_media);
        // Hand the source to a Recorder; whatever the microphone picks up is recorded as a stream
        reco = new Recorder(stream_input);
    }

    function start_reco() {  // start recording
        reco.record(); // start writing the stream into the buffer
    }

    function stop_reco() {  // stop recording
        reco.stop();  // stop writing the stream
        get_audio();  // call the helper defined below
        reco.clear(); // clear the buffer
    }

    // Export the recorded audio
    function get_audio() {
        reco.exportWAV(function (wav_file) {
            // Send the data to the backend
            ws.send(wav_file);
        })
    }

    // Fires when data arrives from the server
    ws.onmessage = function (data) {
        // console.log(data.data);
        console.log(get_file + data.data);  // print the full URL of the audio file
        // Set the src attribute of the element whose id is player
        document.getElementById("player").src = get_file + data.data;
    }

</script>
</html>