
Versions used: EST 2.1, Festival 2.1, Festvox 2.4 (this version is compatible with Festival 2.1; the officially released Festvox 2.1 may have compatibility problems).

0. Before continuing, make sure that you have Speech Tools, Festival, FestVox and EMU Label correctly installed. Also make sure the following environment variables are set.

export ESTDIR=/home/...../speech_tools

export FESTVOXDIR=/home/...../FestVox

export PATH=$PATH:/home/...../Festival/bin
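
A quick sanity check of these settings can save a confusing failure in a later step. The following is only a sketch: the function name check_voice_env is made up for this example and is not part of FestVox, and it assumes that both variables should point at existing directories and that the festival binary should be reachable through PATH.

```shell
# Hypothetical helper (not part of FestVox): report which prerequisites are missing.
check_voice_env() {
    status=0
    # ESTDIR and FESTVOXDIR must be set and must name existing directories.
    for var in ESTDIR FESTVOXDIR; do
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            echo "$var is not set" >&2
            status=1
        elif [ ! -d "$value" ]; then
            echo "$var points to a missing directory: $value" >&2
            status=1
        fi
    done
    # festival must be reachable through PATH.
    if ! command -v festival >/dev/null 2>&1; then
        echo "festival not found in PATH" >&2
        status=1
    fi
    return $status
}
```

Calling check_voice_env after the exports prints one line per problem and returns a non-zero status if anything is missing.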

1. Make a directory and change to it. By convention the directory is named institution_language_name_type, e.g.

mkdir ~/ru_us_matt_diphone

cd ~/ru_us_matt_diphone
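
The naming convention from step 1 can be written down as a small shell function. This is just an illustration; voice_dir_name is a hypothetical name chosen here, not something FestVox provides.

```shell
# Hypothetical helper: join the four parts of the conventional voice directory name.
voice_dir_name() {
    if [ "$#" -ne 4 ]; then
        echo "usage: voice_dir_name INSTITUTION LANGUAGE NAME TYPE" >&2
        return 1
    fi
    printf '%s_%s_%s_%s\n' "$1" "$2" "$3" "$4"
}
```

For the example voice, voice_dir_name ru us matt diphone yields the directory name used in this walkthrough.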

2. FestVox provides a tool to build the basic directory structure. It takes institution, language and

name as arguments. E.g.

$FESTVOXDIR/src/diphones/setup_diphone ru us matt

The setup script also needs to copy in some language-specific files.

3. The nonsense word list must be generated.

festival -b festvox/diphlist.scm festvox/us_schema.scm '(diphone-gen-schema "us" "etc/usdiph.list")'

4. The prompts must be synthesised so that Festival can prompt the user before recording the diphones.

festival -b festvox/diphlist.scm festvox/us_schema.scm '(diphone-gen-waves "prompt-wav" "prompt-lab" "etc/usdiph.list")'



5. The diphones can now be recorded with the prompting script:

bin/prompt_them etc/usdiph.list

Note: if you have to start Festival with the command padsp festival, then this command must likewise be written as

padsp bin/prompt_them etc/usdiph.list


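The terminal shows the phone string about to be recorded, for example pau t aa t ae t aa pau, and then plays the synthesized prompt. Next it announces that recording is starting (two seconds); simply speak the string into the microphone.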
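
With a long prompt list it is easy to lose track of which items have been recorded. The sketch below lists the prompts that have no recording yet. It rests on two assumptions that should be checked against your own setup: that each entry in the list file starts with "( ID ..." (for example "( us_0001 ..."), and that the recording for an entry is stored as wav/ID.wav. The function name missing_recordings is hypothetical.

```shell
# Hypothetical helper: print the ids from a diphone list that have no recording yet.
# Assumes each entry starts with "( ID ..." and that recordings live in WAVDIR/ID.wav.
missing_recordings() {
    list=$1
    wavdir=${2:-wav}
    # Take the second field of every line whose first field is an opening parenthesis.
    awk '$1 == "(" { print $2 }' "$list" | while read -r id; do
        [ -f "$wavdir/$id.wav" ] || echo "$id"
    done
}
```

Run from the voice directory, missing_recordings etc/usdiph.list prints one id per line; no output means every prompt has a matching file.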

6. Once all the recordings are done, the auto-label script can be used to label the recorded segments automatically:

bin/make_labs prompt-wav/*.wav

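7. If you need to correct the labels by hand, note that emulabel, the tool suggested in the official documentation, is obsolete. WaveSurfer can be used instead; on Ubuntu it can be installed directly from the repositories (sudo apt-get install wavesurfer). To edit pitchmarks you also need the WaveSurfer Pitchmark Plugin, currently available at http://mh21.de/pmedit/index.html (if opening the link directly gives a 404, open it from the Google results page instead). Download it into ~/.wavesurfer/1.8/plugins.

Following a description from someone on the mailing list, copy all the files in the wav/, lab/, mcep/ and pm/ directories into one separate directory (at this point my mcep/ and pm/ directories were empty; see below). Then open a recording in WaveSurfer. When asked which configuration to use, choose transcription. Then right-click -> Create Pane -> Pitchmarks.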

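Note that the pane may be completely blank at this point because the pitchmark files (pm/) are missing. Festvox provides two scripts for converting between pitchmark files and label files, make_pm_pmlab and make_pmlab_pm. Use make_pmlab_pm to convert the label files under lab/ into pm files, then copy them into the single directory mentioned above: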

bin/make_pmlab_pm lab/*.lab
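
Copying everything into one directory for WaveSurfer can be scripted. This is a minimal sketch, assuming the files carry the extensions .wav, .lab, .pm and .mcep and that it is run from the voice directory; collect_for_wavesurfer is a hypothetical name. Directories that are empty or missing are skipped.

```shell
# Hypothetical helper: copy recordings, labels and pitchmarks side by side into DEST.
collect_for_wavesurfer() {
    dest=$1
    mkdir -p "$dest" || return 1
    for f in wav/*.wav lab/*.lab pm/*.pm mcep/*.mcep; do
        # An unmatched glob stays literal; skip it instead of failing.
        [ -e "$f" ] || continue
        cp "$f" "$dest/" || return 1
    done
    return 0
}
```

For example, collect_for_wavesurfer ~/ws_edit gathers the files into ~/ws_edit, which can then be opened in WaveSurfer.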


8. The diphone index must be built:

mkdir dic

bin/make_diph_index etc/usdiph.list dic/mattdiph.est


9. The next step is extracting the pitchmarks and then moving each one to the nearest peak.

First, however, etc/usdiph.list has to be copied to etc/txt.done.data; this works around a small bug in the two scripts that are about to be run.

cp etc/usdiph.list etc/txt.done.data

bin/make_pm_wave wav/*.wav

bin/make_pm_fix pm/*.pm
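
The copy at the start of step 9 can be made safe to repeat, so that an existing etc/txt.done.data is never overwritten. The function below is only a sketch with a hypothetical name, ensure_txt_done_data; it assumes it is run from the voice directory.

```shell
# Hypothetical helper: create etc/txt.done.data from the diphone list only if it is missing.
ensure_txt_done_data() {
    list=${1:-etc/usdiph.list}
    if [ -f etc/txt.done.data ]; then
        # Leave an existing file untouched.
        return 0
    fi
    cp "$list" etc/txt.done.data
}
```

Calling ensure_txt_done_data before bin/make_pm_wave has the same effect as the cp command on a fresh voice directory and does nothing on later runs.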


10. You can optionally match the power across recordings. First the files must be analysed and a mean factor extracted:

bin/find_powerfactors lab/*.lab

(At last, a command that runs as written...)

And finally you can use this to build the pitch-synchronous LPC coefficients

bin/make_lpc wav/*.wav


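11. The voice can now be tested by loading it into Festival (drop padsp if your audio setup does not need it):

padsp festival festvox/ru_us_matt_diphone.scm '(voice_ru_us_matt_diphone)'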

12. A group file must be built that contains only the bits needed from the larger wave files.

festival> (us_make_group_file "group/mattlpc.group" nil)

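13. Change to Festival's English voices directory. (For other languages, you can create a new directory under voices. Note that, as I understand it, the us folder under voices stands for unit selection and has nothing to do with US English.)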

cd /your/festival/directory/lib/voices/english/


ln -s /path/to/your/voice/ru_us_matt_diphone
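
The linking step can be wrapped so that it refuses to clobber anything already present in the voices directory. This is a sketch only: link_voice is a hypothetical name, and both arguments are placeholders for your own paths.

```shell
# Hypothetical helper: link VOICE_DIR into VOICES_DIR unless something already has that name.
link_voice() {
    voice_dir=$1
    voices_dir=$2
    target="$voices_dir/$(basename "$voice_dir")"
    if [ ! -d "$voice_dir" ]; then
        echo "no such voice directory: $voice_dir" >&2
        return 1
    fi
    if [ -e "$target" ] || [ -L "$target" ]; then
        echo "already present: $target" >&2
        return 1
    fi
    ln -s "$voice_dir" "$target"
}
```

With the paths used above this would be called as link_voice /path/to/your/voice/ru_us_matt_diphone /your/festival/directory/lib/voices/english.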





