Reference: kaldi 的所有資料_v0.4 ("All Kaldi materials", v0.4)
You can clearly see that there are three configurations, labeled a, b, and c. Options a and b are both set up for running on a cluster; c, which runs jobs locally, is the one we need, since we are running inside a virtual machine. You need to modify this script (cmd.sh) accordingly:
# "queue.pl" uses qsub.  The options to it are
# options to qsub.  If you have GridEngine installed,
# change this to a queue you have access to.
# Otherwise, use "run.pl", which will run jobs locally
# (make sure your --num-jobs options are no more than
# the number of cpus on your machine.

#a) JHU cluster options
#export train_cmd="queue.pl -l arch=*64"
#export decode_cmd="queue.pl -l arch=*64,mem_free=2G,ram_free=2G"
#export mkgraph_cmd="queue.pl -l arch=*64,ram_free=4G,mem_free=4G"
#export cuda_cmd=run.pl

#b) BUT cluster options
#export train_cmd="queue.pl -q all.q@@blade -l ram_free=1200M,mem_free=1200M"
#export decode_cmd="queue.pl -q all.q@@blade -l ram_free=1700M,mem_free=1700M"
#export decodebig_cmd="queue.pl -q all.q@@blade -l ram_free=4G,mem_free=4G"
#export cuda_cmd="queue.pl -q long.q@@pco203 -l gpu=1"
#export cuda_cmd="queue.pl -q long.q@pcspeech-gpu"
#export mkgraph_cmd="queue.pl -q all.q@@servers -l ram_free=4G,mem_free=4G"

#c) run it locally...
export train_cmd=run.pl
export decode_cmd=run.pl
export cuda_cmd=run.pl
export mkgraph_cmd=run.pl
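Since option c uses run.pl, every job runs on the local machine, so the warning in the comments above matters: the nj values used later in run.sh should not exceed your CPU count. A quick way to check it before adjusting those values (nproc is a standard coreutils command; the variable names refer to run.sh, shown further below):

nproc   # prints the number of available CPU cores
# if this prints e.g. 4, keep feats_nj/train_nj/decode_nj in run.sh at 4 or below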
In path.sh, you usually only need to change export KALDI_ROOT=`pwd`/../../.. to the directory where you installed Kaldi; sometimes no change is needed, depending on your setup.
export KALDI_ROOT=`pwd`/../../..
[ -f $KALDI_ROOT/tools/env.sh ] && . $KALDI_ROOT/tools/env.sh
export PATH=$PWD/utils/:$KALDI_ROOT/tools/openfst/bin:$KALDI_ROOT/tools/irstlm/bin/:$PWD:$PATH
[ ! -f $KALDI_ROOT/tools/config/common_path.sh ] && echo >&2 "The standard file $KALDI_ROOT/tools/config/common_path.sh is not present -> Exit!" && exit 1
. $KALDI_ROOT/tools/config/common_path.sh
export LC_ALL=C
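To verify that KALDI_ROOT points at the right place, you can source path.sh and check that a compiled Kaldi binary is found on the PATH. This is only a suggested sanity check; compute-mfcc-feats is a standard Kaldi binary whose directory is added to the PATH by common_path.sh:

. ./path.sh
which compute-mfcc-feats || echo "compute-mfcc-feats not found -- check KALDI_ROOT in path.sh"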
You also need to specify where your data lives. You only need to modify the corpus path, for example:
#timit=/export/corpora5/LDC/LDC93S1/timit/TIMIT # @JHU
timit=/mnt/matylda2/data/TIMIT/timit # @BUT
Change it to the path where your copy of TIMIT is located. The other corpora are handled the same way.
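Before running, it is worth checking that the path you set actually contains the corpus layout that local/timit_data_prep.sh expects (TRAIN and TEST subdirectories). The path below is just a placeholder:

timit=/your/path/to/TIMIT   # placeholder -- substitute your own copy
ls "$timit"                 # should list TRAIN and TEST (case may vary by distribution)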
In addition, the voxforge, vystadial_cz, and vystadial_en corpora are free to download, so if you don't have access to a licensed corpus you can use one of these for your experiments.
Finally, let's walk through the run.sh script, using the s5 recipe under timit as the example.
Location: /home/dahu/myfile/my_git/kaldi/egs/timit/s5
#!/bin/bash
#
# Copyright 2013  Bagher BabaAli,
#           2014-2017  Brno University of Technology (Author: Karel Vesely)
#
# TIMIT, description of the database:
# http://perso.limsi.fr/lamel/TIMIT_NISTIR4930.pdf
#
# Hon and Lee paper on TIMIT, 1988, introduces mapping to 48 training phonemes,
# then re-mapping to 39 phonemes for scoring:
# http://repository.cmu.edu/cgi/viewcontent.cgi?article=2768&context=compsci
#

. ./cmd.sh
[ -f path.sh ] && . ./path.sh # it's worth checking that the paths in path.sh are correct
set -e

# Acoustic model parameters; no need to change them for now
numLeavesTri1=2500
numGaussTri1=15000
numLeavesMLLT=2500
numGaussMLLT=15000
numLeavesSAT=2500
numGaussSAT=15000
numGaussUBM=400
numLeavesSGMM=7000
numGaussSGMM=9000

feats_nj=10
train_nj=30
decode_nj=5
# nj is the number of parallel jobs to run; it should normally not exceed the number of CPUs

echo ============================================================================
echo "         Data & Lexicon & Language Preparation                            "
echo ============================================================================

#timit=/export/corpora5/LDC/LDC93S1/timit/TIMIT # @JHU
timit=/mnt/matylda2/data/TIMIT/timit # @BUT
# change this to your own TIMIT path

local/timit_data_prep.sh $timit || exit 1

local/timit_prepare_dict.sh

# Caution below: we remove optional silence by setting "--sil-prob 0.0",
# in TIMIT the silence appears also as a word in the dictionary and is scored.
utils/prepare_lang.sh --sil-prob 0.0 --position-dependent-phones false --num-sil-states 3 \
 data/local/dict "sil" data/local/lang_tmp data/lang

local/timit_format_data.sh

echo ============================================================================
echo "         MFCC Feature Extraction & CMVN for Training and Test set         "
echo ============================================================================

# Now make MFCC features. This is the feature-extraction stage.
mfccdir=mfcc

for x in train dev test; do
  steps/make_mfcc.sh --cmd "$train_cmd" --nj $feats_nj data/$x exp/make_mfcc/$x $mfccdir
  steps/compute_cmvn_stats.sh data/$x exp/make_mfcc/$x $mfccdir
done

echo ============================================================================
echo "                     MonoPhone Training & Decoding                        "
echo ============================================================================

# Monophone training and decoding -- the most fundamental part of speech
# recognition!! Worth studying in detail.
steps/train_mono.sh --nj "$train_nj" --cmd "$train_cmd" data/train data/lang exp/mono

utils/mkgraph.sh data/lang_test_bg exp/mono exp/mono/graph

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/mono/graph data/dev exp/mono/decode_dev

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/mono/graph data/test exp/mono/decode_test
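# (Optional sanity check, not part of the original recipe: after a decode you can
# print the best error rate over all tested LM weights. utils/best_wer.sh is a
# standard Kaldi utility available through the utils/ symlink in s5.)
# grep WER exp/mono/decode_test/wer_* | utils/best_wer.sh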
echo ============================================================================
echo "           tri1 : Deltas + Delta-Deltas Training & Decoding               "
echo ============================================================================

# Triphone training and decoding
steps/align_si.sh --boost-silence 1.25 --nj "$train_nj" --cmd "$train_cmd" \
 data/train data/lang exp/mono exp/mono_ali

# Train tri1, which is deltas + delta-deltas, on train data.
steps/train_deltas.sh --cmd "$train_cmd" \
 $numLeavesTri1 $numGaussTri1 data/train data/lang exp/mono_ali exp/tri1

utils/mkgraph.sh data/lang_test_bg exp/tri1 exp/tri1/graph

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri1/graph data/dev exp/tri1/decode_dev

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri1/graph data/test exp/tri1/decode_test

echo ============================================================================
echo "                 tri2 : LDA + MLLT Training & Decoding                    "
echo ============================================================================

# LDA + MLLT transforms on top of the triphone model
steps/align_si.sh --nj "$train_nj" --cmd "$train_cmd" \
 data/train data/lang exp/tri1 exp/tri1_ali

steps/train_lda_mllt.sh --cmd "$train_cmd" \
 --splice-opts "--left-context=3 --right-context=3" \
 $numLeavesMLLT $numGaussMLLT data/train data/lang exp/tri1_ali exp/tri2

utils/mkgraph.sh data/lang_test_bg exp/tri2 exp/tri2/graph

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri2/graph data/dev exp/tri2/decode_dev

steps/decode.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri2/graph data/test exp/tri2/decode_test

echo ============================================================================
echo "              tri3 : LDA + MLLT + SAT Training & Decoding                 "
echo ============================================================================

# LDA + MLLT + SAT on top of the triphone model
# Align tri2 system with train data.
steps/align_si.sh --nj "$train_nj" --cmd "$train_cmd" \
 --use-graphs true data/train data/lang exp/tri2 exp/tri2_ali

# From tri2 system, train tri3 which is LDA + MLLT + SAT.
steps/train_sat.sh --cmd "$train_cmd" \
 $numLeavesSAT $numGaussSAT data/train data/lang exp/tri2_ali exp/tri3

utils/mkgraph.sh data/lang_test_bg exp/tri3 exp/tri3/graph

steps/decode_fmllr.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri3/graph data/dev exp/tri3/decode_dev

steps/decode_fmllr.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 exp/tri3/graph data/test exp/tri3/decode_test
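# (Optional check, not part of the original recipe: gmm-info is a standard Kaldi
# binary that prints the number of pdfs and Gaussians of a trained model; for
# exp/tri3 these should roughly match numLeavesSAT and numGaussSAT above.)
# gmm-info exp/tri3/final.mdl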
echo ============================================================================
echo "                        SGMM2 Training & Decoding                         "
echo ============================================================================

# SGMM2 trained on top of the tri3 (SAT) model
steps/align_fmllr.sh --nj "$train_nj" --cmd "$train_cmd" \
 data/train data/lang exp/tri3 exp/tri3_ali

exit 0 # NOTE: this early exit stops the recipe here; delete it to run the remaining stages.

# From this point you can run Karel's DNN : local/nnet/run_dnn.sh

steps/train_ubm.sh --cmd "$train_cmd" \
 $numGaussUBM data/train data/lang exp/tri3_ali exp/ubm4

steps/train_sgmm2.sh --cmd "$train_cmd" $numLeavesSGMM $numGaussSGMM \
 data/train data/lang exp/tri3_ali exp/ubm4/final.ubm exp/sgmm2_4

utils/mkgraph.sh data/lang_test_bg exp/sgmm2_4 exp/sgmm2_4/graph

steps/decode_sgmm2.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 --transform-dir exp/tri3/decode_dev exp/sgmm2_4/graph data/dev \
 exp/sgmm2_4/decode_dev

steps/decode_sgmm2.sh --nj "$decode_nj" --cmd "$decode_cmd" \
 --transform-dir exp/tri3/decode_test exp/sgmm2_4/graph data/test \
 exp/sgmm2_4/decode_test

echo ============================================================================
echo "                    MMI + SGMM2 Training & Decoding                       "
echo ============================================================================

# MMI discriminative training on top of SGMM2
steps/align_sgmm2.sh --nj "$train_nj" --cmd "$train_cmd" \
 --transform-dir exp/tri3_ali --use-graphs true --use-gselect true \
 data/train data/lang exp/sgmm2_4 exp/sgmm2_4_ali

steps/make_denlats_sgmm2.sh --nj "$train_nj" --sub-split "$train_nj" \
 --acwt 0.2 --lattice-beam 10.0 --beam 18.0 \
 --cmd "$decode_cmd" --transform-dir exp/tri3_ali \
 data/train data/lang exp/sgmm2_4_ali exp/sgmm2_4_denlats

steps/train_mmi_sgmm2.sh --acwt 0.2 --cmd "$decode_cmd" \
 --transform-dir exp/tri3_ali --boost 0.1 --drop-frames true \
 data/train data/lang exp/sgmm2_4_ali exp/sgmm2_4_denlats exp/sgmm2_4_mmi_b0.1

for iter in 1 2 3 4; do
  steps/decode_sgmm2_rescore.sh --cmd "$decode_cmd" --iter $iter \
   --transform-dir exp/tri3/decode_dev data/lang_test_bg data/dev \
   exp/sgmm2_4/decode_dev exp/sgmm2_4_mmi_b0.1/decode_dev_it$iter

  steps/decode_sgmm2_rescore.sh --cmd "$decode_cmd" --iter $iter \
   --transform-dir exp/tri3/decode_test data/lang_test_bg data/test \
   exp/sgmm2_4/decode_test exp/sgmm2_4_mmi_b0.1/decode_test_it$iter
done

echo ============================================================================
echo "                    DNN Hybrid Training & Decoding                        "
echo ============================================================================

# This is Povey's (nnet2) DNN setup; the tutorial advises against using it here
# DNN hybrid system training parameters
dnn_mem_reqs="--mem 1G"
dnn_extra_opts="--num_epochs 20 --num-epochs-extra 10 --add-layers-period 1 --shrink-interval 3"
# note: dnn_extra_opts above is defined but the script passes ${dnn_train_extra_opts[@]},
# which expands to nothing unless you define it yourself

steps/nnet2/train_tanh.sh --mix-up 5000 --initial-learning-rate 0.015 \
 --final-learning-rate 0.002 --num-hidden-layers 2 \
 --num-jobs-nnet "$train_nj" --cmd "$train_cmd" "${dnn_train_extra_opts[@]}" \
 data/train data/lang exp/tri3_ali exp/tri4_nnet

[ ! -d exp/tri4_nnet/decode_dev ] && mkdir -p exp/tri4_nnet/decode_dev
decode_extra_opts=(--num-threads 6)
steps/nnet2/decode.sh --cmd "$decode_cmd" --nj "$decode_nj" "${decode_extra_opts[@]}" \
 --transform-dir exp/tri3/decode_dev exp/tri3/graph data/dev \
 exp/tri4_nnet/decode_dev | tee exp/tri4_nnet/decode_dev/decode.log

[ ! -d exp/tri4_nnet/decode_test ] && mkdir -p exp/tri4_nnet/decode_test
steps/nnet2/decode.sh --cmd "$decode_cmd" --nj "$decode_nj" "${decode_extra_opts[@]}" \
 --transform-dir exp/tri3/decode_test exp/tri3/graph data/test \
 exp/tri4_nnet/decode_test | tee exp/tri4_nnet/decode_test/decode.log

echo ============================================================================
echo "                    System Combination (DNN+SGMM)                         "
echo ============================================================================

# DNN + SGMM system combination
for iter in 1 2 3 4; do
  local/score_combine.sh --cmd "$decode_cmd" \
   data/dev data/lang_test_bg exp/tri4_nnet/decode_dev \
   exp/sgmm2_4_mmi_b0.1/decode_dev_it$iter exp/combine_2/decode_dev_it$iter

  local/score_combine.sh --cmd "$decode_cmd" \
   data/test data/lang_test_bg exp/tri4_nnet/decode_test \
   exp/sgmm2_4_mmi_b0.1/decode_test_it$iter exp/combine_2/decode_test_it$iter
done

echo ============================================================================
echo "               DNN Hybrid Training & Decoding (Karel's recipe)            "
echo ============================================================================

# Karel's DNN (nnet1) recipe -- a general-purpose deep learning framework!!
local/nnet/run_dnn.sh
#local/nnet/run_autoencoder.sh : an example, not used to build any system,

echo ============================================================================
echo "                    Getting Results [see RESULTS file]                    "
echo ============================================================================

# Print the final recognition results for all the systems above
bash RESULTS dev
bash RESULTS test

echo ============================================================================
echo "Finished successfully on" `date`
echo ============================================================================

exit 0
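To actually launch the recipe and keep a complete log of the console output, a minimal invocation could look like this (the directory and log file name below are placeholders):

cd /path/to/kaldi/egs/timit/s5   # your own checkout
./run.sh 2>&1 | tee run.log      # run the whole recipe and save the output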
Having read through these three basic scripts and gotten a rough idea of what each one does, I'm now downloading the TIMIT data and will run the recipe afterwards.
TIMIT dataset download: see "kaldi timit 實例運行全過程" (a full walkthrough of running the Kaldi TIMIT example).