Building an English Voice for Festival with FestVox

Versions used: EST 2.1, Festival 2.1, FestVox 2.4 (this version is compatible with Festival 2.1; the officially released FestVox 2.1 may have compatibility problems)

The process is fairly long: recording and labelling take a few hours each, and after that there may still be extensive hand correction. It also requires a fair amount of disk space, so make sure you have at least 500 MB free. The basic list of steps is as follows.

Create templates.

Generate prompts.

Record nonsense words.

Autolabel recorded words.

Generate diphone index.

Generate pitchmarks and LPC coefficients.

Test.

Package for distribution.

0. Before continuing, make sure that you have Speech Tools, Festival, FestVox and EMU Label correctly installed. Also make sure the following environment variables are set.

export ESTDIR=/home/...../speech_tools

export FESTVOXDIR=/home/...../FestVox

export PATH=$PATH:/home/...../Festival/bin
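Before going further it can help to confirm that these variables really are set in the current shell. The following is a minimal sketch, not part of FestVox: check_festvox_env is a hypothetical helper name, and it only checks that ESTDIR and FESTVOXDIR name existing directories and that a festival executable can be found on PATH.

```shell
# Hypothetical helper (not shipped with FestVox): report which of the
# settings from step 0 are missing. Returns non-zero if anything is wrong.
check_festvox_env() {
    status=0
    for var in ESTDIR FESTVOXDIR; do
        # Indirectly read the variable whose name is stored in $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$var points to a missing directory: $value"
            status=1
        fi
    done
    if ! command -v festival >/dev/null 2>&1; then
        echo "festival not found on PATH"
        status=1
    fi
    return $status
}

# Usage (run manually): check_festvox_env && echo "environment looks fine"
```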

1. Make a directory and change to it. By convention the directory is named institution_language_name_type, e.g.

mkdir ~/ru_us_matt_diphone

cd ~/ru_us_matt_diphone
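The naming convention can be written down as a tiny shell function. This is only an illustration; voice_dir_name is a hypothetical name, and it simply joins its four arguments with underscores.

```shell
# Hypothetical helper: build a voice directory name from
# institution, language, speaker name and voice type.
voice_dir_name() {
    printf '%s_%s_%s_%s\n' "$1" "$2" "$3" "$4"
}

# Usage (run manually): mkdir ~/"$(voice_dir_name ru us matt diphone)"
```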

2. FestVox provides a tool to build the basic directory structure. It takes institution, language and name as arguments, e.g.

$FESTVOXDIR/src/diphones/setup_diphone ru us matt

The setup script also needs to copy in some language-specific files. For US English the following packages will need to be part of Festival.

festvox_kallpc16k

festlex_POSLEX

festlex_CMU

(See the author's other post: Notes on Compiling Festival)

3. The nonsense word list must be generated.

festival -b festvox/diphlist.scm festvox/us_schema.scm '(diphone-gen-schema "us" "etc/usdiph.list")'

4. The prompts must be synthesised so that Festival can prompt the user before recording the diphones.

festival -b festvox/diphlist.scm festvox/us_schema.scm '(diphone-gen-waves "prompt-wav" "prompt-lab" "etc/usdiph.list")'

Wait patiently while a long stream of output goes by.

5. Now the diphones can be recorded:

bin/prompt_them etc/usdiph.list

Note: if you have to start Festival with the command padsp festival, this command must also be written as

padsp bin/prompt_them etc/usdiph.list

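Note: before pressing Enter, make sure that you can concentrate fully and that you are in a quiet environment, because you will then be watching the screen and recording without a break for two hours, producing 1369 recordings.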

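The terminal shows the phone sequence about to be recorded, for example pau t aa t ae t aa pau, then plays the synthesised prompt, then prompts you to start a two-second recording; simply speak the sequence into the microphone.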
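After such a long session it is worth checking that no prompt was skipped. The sketch below rests on two assumptions that should be checked against your own files: that each entry in etc/usdiph.list has the form ( us_0001 "pau t aa b aa pau" ... ), with the file ID as the second field, and that each recording is saved as wav/ID.wav. missing_recordings is a hypothetical helper name.

```shell
# Hypothetical helper: print the IDs from a prompt list that have no
# matching .wav file. Assumes lines of the form: ( ID "text" ... )
missing_recordings() {
    list=$1
    wavdir=$2
    # Take field 2 of every line that starts with a lone "(".
    awk '$1 == "(" { print $2 }' "$list" | while read -r id; do
        [ -f "$wavdir/$id.wav" ] || echo "$id"
    done
}

# Usage (run manually): missing_recordings etc/usdiph.list wav
```

An empty result means every prompt in the list has a recording.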

6. Once all the recordings are done, the auto-labelling script can be used to label them automatically:

bin/make_labs prompt-wav/*.wav

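7. If the labels need to be corrected by hand, note that emulabel, the tool suggested by the official documentation, is now obsolete. WaveSurfer can be used instead; on Ubuntu it can be installed directly from the repositories (sudo apt-get install wavesurfer). To edit pitchmarks you also need the Wavesurfer Pitchmark Plugin, currently available at http://mh21.de/pmedit/index.html (if opening the link directly gives a 404, open it from a Google results page instead). Download it into ~/.wavesurfer/1.8/plugins. Following a description from someone on the mailing list, copy all the files in the wav/, lab/, mcep/ and pm/ directories into one separate directory (at this point my mcep/ and pm/ directories were empty; see below). Then open a recording in WaveSurfer. When asked which configuration to use, choose transcription. Then right-click -> Create Pane -> Pitchmarks.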

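Note: at this point the pane may be blank because the pitchmark files (pm/) are missing. FestVox provides two scripts for converting between pitchmark and label files, make_pm_pmlab and make_pmlab_pm. Use make_pmlab_pm to convert the label files under lab/ into pm files, then copy them into the same directory mentioned above: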

bin/make_pmlab_pm lab/*.lab

8. The diphone index must now be built:

mkdir dic

bin/make_diph_index etc/usdiph.list dic/mattdiph.est

This script does not create the dic directory itself, so it fails if the voice directory has no dic directory; create it before running the command.
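One way to make this step safe to repeat is to create the output directory with mkdir -p, which succeeds whether or not the directory already exists. A minimal sketch, with ensure_parent_dir as a hypothetical helper name:

```shell
# Hypothetical helper: create the directory that will hold an output
# file. mkdir -p does not fail if the directory is already there.
ensure_parent_dir() {
    mkdir -p "$(dirname "$1")"
}

# Usage (run manually):
#   ensure_parent_dir dic/mattdiph.est &&
#       bin/make_diph_index etc/usdiph.list dic/mattdiph.est
```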

9. The next step is to extract the pitchmarks and then move each one to the nearest peak.

First, however, copy etc/usdiph.list to etc/txt.done.data; this works around a small bug in the two scripts about to be run.

cp etc/usdiph.list etc/txt.done.data

bin/make_pm_wave wav/*.wav

bin/make_pm_fix pm/*.pm

If this command fails, check whether pm/ contains any .pm files; if it does not, see step 7 and use the make_pmlab_pm script.
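A quick way to do that check is to count the files of each kind. The sketch below is only an illustration; count_files is a hypothetical helper name, and comparing the pm and wav counts assumes that each recording should end up with one pitchmark file.

```shell
# Hypothetical helper: print how many files named <dir>/*.<ext> exist.
count_files() {
    dir=$1
    ext=$2
    n=0
    for f in "$dir"/*."$ext"; do
        # When nothing matches, the pattern stays literal and fails -e.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Usage (run manually):
#   echo "pm: $(count_files pm pm)  wav: $(count_files wav wav)"
```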

10. You can optionally match the power. First the files must be analysed and a mean factor extracted:

bin/find_powerfactors lab/*.lab

(Finally, a command that runs as written...)

And finally you can use this to build the pitch-synchronous LPC coefficients

bin/make_lpc wav/*.wav

11. The voice can now be tested (Festival apparently cannot output through ALSA, so padsp is put in front of the command):

padsp festival festvox/ru_us_matt_diphone.scm '(voice_ru_us_matt_diphone)'

12. A group file must be built that contains only the bits needed from the larger wave files. At the Festival prompt, with the voice loaded as in step 11:

festival> (us_make_group_file "group/mattlpc.group" nil)

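13. Change to Festival's English voices directory (for another language, create a new directory under voices. Note that the us directory under voices stands for unit selection and has nothing to do with US English):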

cd /your/festival/directory/lib/voices/english/

Add a symbolic link to our voice:

ln -s /path/to/your/voice/ru_us_matt_diphone
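The same step can be done without changing directory, and with a guard against overwriting a voice that is already installed. This is a sketch only; link_voice is a hypothetical helper name, and both paths in the usage line are the placeholders used above.

```shell
# Hypothetical helper: link a voice directory into a Festival voices
# directory, refusing to touch an entry that already exists.
link_voice() {
    voice_dir=$1    # directory containing the built voice
    voices_dir=$2   # e.g. Festival's lib/voices/english directory
    target="$voices_dir/$(basename "$voice_dir")"
    if [ -e "$target" ] || [ -L "$target" ]; then
        echo "already present: $target"
        return 1
    fi
    ln -s "$voice_dir" "$target"
}

# Usage (run manually):
#   link_voice /path/to/your/voice/ru_us_matt_diphone \
#       /your/festival/directory/lib/voices/english
```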

14. Start Festival again and enter the following command:

(voice_ru_us_matt_diphone)

Our voice should now be usable.

References

  1. Creating a Voice for Festival Speech Synthesis System

  2. Mailing list of the EMU Speech Database System

  3. Build Synthesis Voice 2.1
