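These are notes taken while studying, including links to related material. Some of it I did not read closely at the time, so I recorded it here to revisit when needed.
Some parts are still disorganized and will be updated later.
Anyone interested is welcome to join the discussion, and pointers from experts are much appreciated~
tags: voice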
pyAudio
http://old.sebug.net/paper/books/scipydoc/wave_pyaudio.html
Note: this part streams audio from the recording device to the voice activity detector (VAD).
"path/to/vad/audio_stream.py"

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    # Python 2 script (Tkinter module name, print statement)

    import numpy as np
    from pyaudio import PyAudio, paInt16
    from datetime import datetime
    import wave
    from Tkinter import *
    import sys
    from ffnn import FFNNVADGeneral
    import logging
    # import chardet  # inspect the encoding

    # parameters
    NUM_SAMPLES = 160   # samples per read
    FRAMERATE = 16000   # sampling rate in Hz
    CHANNELS = 1
    SAMPWIDTH = 2       # bytes per sample
    FORMAT = paInt16
    TIME = 125
    FRAMESHIFT = 160

    def save_wave_file(filename, data):
        '''save the data to the wav file'''
        wf = wave.open(filename, 'wb')
        wf.setnchannels(CHANNELS)
        wf.setsampwidth(SAMPWIDTH)
        wf.setframerate(FRAMERATE)
        # The separator must be an empty string: a space between the quotes
        # ends up in the audio data and causes many dropouts in the recording.
        wf.writeframes("".join(data))
        wf.close()

    def my_button(root, label_text, button_text, button_stop, button_func, stop_func):
        '''create the label and the two buttons'''
        Label(root, text=label_text, width=30, height=3).pack()
        Button(root, text=button_text, command=button_func, anchor='center', width=30, height=3).pack()
        Button(root, text=button_stop, command=stop_func, anchor='center', width=30, height=3).pack()

    def record_wave():
        '''open the input of wave'''
        pa = PyAudio()
        # Recording: each buffer holds NUM_SAMPLES samples and is treated as one frame.
        stream = pa.open(format=FORMAT, channels=CHANNELS, rate=FRAMERATE,
                         input=True, frames_per_buffer=NUM_SAMPLES)
        # FFNNVADGeneral is the neural-network voice activity detection class.
        vad = FFNNVADGeneral('/path/to/VAD/alex-master/alex/tools/vad_train/model_voip/vad_nnt_546_hu32_hl1_hla6_pf10_nf10_acf_1.0_mfr20000_mfl20000_mfps0_ts0_usec00_usedelta0_useacc0_mbo1_bs100.tffnn',
                             filter_length=2, sample_rate=16000, framesize=512, frameshift=160,
                             usehamming=True, preemcoef=0.97, numchans=26, ceplifter=22, numceps=12,
                             enormalise=True, zmeansource=True, usepower=True, usec0=False,
                             usecmn=False, usedelta=False, useacc=False,
                             n_last_frames=10, n_prev_frames=10, lofreq=125, hifreq=3800,
                             mel_banks_only=True)
        save_buffer = []
        count = 0
        # logging setup, used to write the log file
        logging.basicConfig(level=logging.INFO, filename='log.txt', filemode='w', format='%(message)s')
        while count < TIME*4:
            string_audio_data = stream.read(NUM_SAMPLES)
            result = vad.decide(string_audio_data)
            frame = count*NUM_SAMPLES/float(FRAMESHIFT)
            time = count*NUM_SAMPLES/float(FRAMERATE)  # time = frame*frameshift/framerate
            # the log message is a string built with '+'
            logging.info('frame: '+str(frame)+' time: '+str(time)+' prob: '+str(result))
            save_buffer.append(string_audio_data)
            count += 1
            # chardet.detect(string_audio_data)  # inspect the encoding type
            print "."
        # release the audio device once recording is done
        stream.stop_stream()
        stream.close()
        pa.terminate()
        filename = datetime.now().strftime("%Y-%m-%d_%H_%M_%S")+".wav"
        save_wave_file(filename, save_buffer)
        save_buffer = []
        print filename, "saved."

    def record_stop():
        # stop recording
        sys.exit(0)

    def main():
        root = Tk()
        root.geometry('300x200+200+200')
        root.title('record wave')
        my_button(root, "Record a wave", "click to record", "stop recording", record_wave, record_stop)
        root.mainloop()

    if __name__ == "__main__":
        main()

Error:

    bt_audio_service_open: connect() failed: Connection refused (111)

Fix (apparently a Bluetooth library is installed although there is no Bluetooth device):

    $ sudo apt-get purge bluez-alsa

Warning:

    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
    ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
    Cannot connect to server socket err = No such file or directory
    Cannot connect to server request channel
    jack server is not running or cannot be started

This is caused by the default settings in /usr/share/alsa/alsa.conf.
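The `frame` and `time` values logged by `record_wave` follow directly from the constants at the top of the script. As a minimal sketch (the helper name `frame_and_time` is my own and is not part of the original script), the arithmetic can be checked on its own, without any audio hardware:

```python
NUM_SAMPLES = 160   # samples per stream.read(), as in the script
FRAMERATE = 16000   # sampling rate in Hz
FRAMESHIFT = 160    # samples between consecutive frames
TIME = 125          # the recording loop runs TIME * 4 reads


def frame_and_time(count):
    """Return (frame index, elapsed seconds) before read number `count`."""
    samples = count * NUM_SAMPLES
    return samples / float(FRAMESHIFT), samples / float(FRAMERATE)


total_reads = TIME * 4
frame, seconds = frame_and_time(total_reads)
print("reads: %d, frames: %.1f, seconds: %.2f" % (total_reads, frame, seconds))
```

With these values each read is exactly one frame of 10 ms, so the loop of `TIME*4` reads covers 5 seconds of audio.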
    sudo apt-get update    # refresh the package index; a mirror in China is recommended (search Baidu for how to configure one)
    sudo apt-get upgrade   # upgrade installed packages
    sudo apt-get install alsa-utils alsa-tools alsa-tools-gui alsamixergui   # install the required packages

    # list the capture devices
    $ arecord -l
    card 0: PCH [HDA Intel PCH], device 0: ALC887-VD Analog [ALC887-VD Analog]
      Subdevices: 1/1
      Subdevice #0: subdevice #0
    card 0: PCH [HDA Intel PCH], device 2: ALC887-VD Alt Analog [ALC887-VD Alt Analog]
      Subdevices: 1/1
      Subdevice #0: subdevice #0

    # if the machine has more than one sound card, this command lists them
    $ cat /proc/asound/cards
     0 [PCH            ]: HDA-Intel - HDA Intel PCH
                          HDA Intel PCH at 0xf7210000 irq 29
     1 [NVidia         ]: HDA-Intel - HDA NVidia
                          HDA NVidia at 0xf7080000 irq 17

    # each sound card has a card number and device numbers; this command lists them
    $ aplay -l
    card 0: PCH [HDA Intel PCH], device 0: ALC887-VD Analog [ALC887-VD Analog]
      Subdevices: 1/1
      Subdevice #0: subdevice #0
    card 0: PCH [HDA Intel PCH], device 1: ALC887-VD Digital [ALC887-VD Digital]
      Subdevices: 1/1
      Subdevice #0: subdevice #0
    card 1: NVidia [HDA NVidia], device 3: HDMI 0 [HDMI 0]
      Subdevices: 1/1
      Subdevice #0: subdevice #0
    card 1: NVidia [HDA NVidia], device 7: HDMI 1 [HDMI 1]
      Subdevices: 1/1
      Subdevice #0: subdevice #0

    # record
    $ arecord -D "plughw:0,0" -f S16_LE -r 16000 -d 5 -t wav file.wav
    # -D  device to use; I tried hw:1,0 and hw:0,2, but only hw:0,0 could record
    # -f  sample format; S16_LE means signed 16-bit little-endian
    # -r  sampling rate
    # -d  recording duration in seconds
    # -t  file type
    # file.wav  output file name
    # Without the plug prefix a warning appears (I attribute this to the external
    # sound card); plughw lets ALSA convert the sampling rate automatically.
    Warning: rate is not accurate (requested = 16000Hz, got = 44100Hz)
             please, try the plug plugin

    # check the recording
    $ aplay file.wav
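Besides listening with `aplay`, the header of the recorded file can be checked to confirm that it really is 16 kHz, mono, 16-bit. The following is a minimal sketch using only the standard `wave` module; `wav_info` and `write_silence` are hypothetical helpers of my own, and the silent demo file merely stands in for a real `file.wav` so that the snippet runs without a microphone:

```python
import os
import tempfile
import wave


def wav_info(path):
    """Return (channels, sample width in bytes, rate in Hz, duration in s)."""
    wf = wave.open(path, "rb")
    try:
        rate = wf.getframerate()
        return (wf.getnchannels(), wf.getsampwidth(), rate,
                wf.getnframes() / float(rate))
    finally:
        wf.close()


def write_silence(path, seconds, rate=16000):
    """Write a mono 16-bit WAV of silence, standing in for a real recording."""
    wf = wave.open(path, "wb")
    try:
        wf.setnchannels(1)
        wf.setsampwidth(2)
        wf.setframerate(rate)
        wf.writeframes(b"\x00\x00" * int(seconds * rate))
    finally:
        wf.close()


demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
write_silence(demo_path, 5)
print(wav_info(demo_path))
```

For an actual recording, call `wav_info("file.wav")` and compare the result against the `-f`, `-r` and `-d` values passed to `arecord`.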
ALSA normally defines a defaults device, and audio players send their output to it by default. The defaults device is defined in alsa.conf as follows:
    #
    #  defaults
    #

    # show all name hints also for definitions without hint {} section
    defaults.namehint.showall off
    # show just basic name hints
    defaults.namehint.basic on
    # show extended name hints
    defaults.namehint.extended off
    #
    defaults.ctl.card 0
    defaults.pcm.card 0
    defaults.pcm.device 0
    defaults.pcm.subdevice -1
    ……
By default, defaults matches the sound card with the lowest card number and device number.
To change this, edit /etc/asound.conf or ~/.asoundrc. For example, to point defaults at card 1, device 3, add the following lines:
    $ sudo vim /etc/asound.conf
    defaults.pcm.card 1
    defaults.pcm.device 3
    defaults.ctl.card 1
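The card and device numbers for these lines come from the `aplay -l` listing. As a rough sketch, the listing can be parsed and the override lines generated from it; the sample text is copied from the output shown earlier, while `list_devices`, `render_defaults` and the regular expression are my own illustrative helpers, not part of ALSA:

```python
import re

# Sample text copied from the `aplay -l` listing earlier in these notes.
APLAY_OUTPUT = """\
card 0: PCH [HDA Intel PCH], device 0: ALC887-VD Analog [ALC887-VD Analog]
card 0: PCH [HDA Intel PCH], device 1: ALC887-VD Digital [ALC887-VD Digital]
card 1: NVidia [HDA NVidia], device 3: HDMI 0 [HDMI 0]
card 1: NVidia [HDA NVidia], device 7: HDMI 1 [HDMI 1]
"""

# Captures the card number, the short card name and the device number.
DEVICE_LINE = re.compile(r"^card (\d+): (\S+) .*?, device (\d+):", re.MULTILINE)


def list_devices(text):
    """Return sorted (card, device, card name) tuples found in the listing."""
    return sorted((int(c), int(d), name)
                  for c, name, d in DEVICE_LINE.findall(text))


def render_defaults(card, device):
    """Render the override lines for /etc/asound.conf or ~/.asoundrc."""
    return "\n".join([
        "defaults.pcm.card %d" % card,
        "defaults.pcm.device %d" % device,
        "defaults.ctl.card %d" % card,
    ])


devices = list_devices(APLAY_OUTPUT)
# Pick the first device on the card named NVidia.
card, device, _ = [d for d in devices if d[2] == "NVidia"][0]
print(render_defaults(card, device))
```

On a real machine the sample string would be replaced by the actual command output, and the generated lines still have to be written to the configuration file by hand.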
https://github.com/aaronaanderson/ofxPortSF
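For some items I may have forgotten to record the source URL; please point out anything improper~~
(Unless otherwise stated, these articles are Vanessa's personal notes. Please credit the source when reposting. If anything here infringes copyright, please contact me and I will remove it promptly.)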