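Kaldi is a speech recognition toolkit written in C++ and released under the Apache License v2.0, and it is one of the most popular ASR tools today. This article explains how to install Kaldi on Ubuntu 18.04 LTS.
First, following the instructions on the official website, clone the Kaldi project to your local machine: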
~$ git clone https://github.com/kaldi-asr/kaldi.git kaldi-trunk --origin golden
Enter kaldi-trunk:
~$ cd kaldi-trunk
~/kaldi-trunk$
Read INSTALL:
~/kaldi-trunk$ cat INSTALL
This is the official Kaldi INSTALL. Look also at INSTALL.md for the git mirror installation.
[for native Windows install, see windows/INSTALL]

(1) go to tools/ and follow INSTALL instructions there.

(2) go to src/ and follow INSTALL instructions there.
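So the order is: first go into the tools directory and follow its instructions, then go into the src directory and follow its instructions.
Enter the tools directory and read INSTALL: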
~/kaldi-trunk$ cd tools
~/kaldi-trunk/tools$ cat INSTALL
To check the prerequisites for Kaldi, first run

  extras/check_dependencies.sh

and see if there are any system-level installations you need to do. Check the output carefully. There are some things that will make your life a lot easier if you fix them at this stage. If your system default C++ compiler is not supported, you can do the check with another compiler by setting the CXX environment variable, e.g.

  CXX=g++-4.8 extras/check_dependencies.sh

Then run

  make

which by default will install ATLAS headers, OpenFst, SCTK and sph2pipe. OpenFst requires a relatively recent C++ compiler with C++11 support, e.g. g++ >= 4.7, Apple clang >= 5.0 or LLVM clang >= 3.3. If your system default compiler does not have adequate support for C++11, you can specify a C++11 compliant compiler as a command argument, e.g.

  make CXX=g++-4.8

If you have multiple CPUs and want to speed things up, you can do a parallel build by supplying the "-j" option to make, e.g. to use 4 CPUs

  make -j 4

In extras/, there are also various scripts to install extra bits and pieces that are used by individual example scripts. If an example script needs you to run one of those scripts, it will tell you what to do.
So the first step is to run the check_dependencies.sh script in the extras directory to check whether the dependencies are installed.
Enter extras and run check_dependencies.sh:
~/kaldi-trunk/tools$ cd extras/
~/kaldi-trunk/tools/extras$ ./check_dependencies.sh
./check_dependencies.sh: all OK.
If check_dependencies.sh prints any message saying that a library is missing, resolve it as instructed and run the script again, until it prints "./check_dependencies.sh: all OK." as shown above.
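The "rerun until clean" check can also be scripted instead of read by eye. The following is only a sketch under assumptions: `deps_ok` is a hypothetical helper (it is not part of Kaldi) that reads the checker's output on standard input and succeeds only when it finds the "all OK" marker from the transcript above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Kaldi): succeeds only when the text on
# stdin contains the "all OK" marker printed by check_dependencies.sh.
deps_ok() {
    grep -q 'all OK'
}

# Usage, from the tools directory (commented out so the sketch runs anywhere):
#   out=$(extras/check_dependencies.sh 2>&1)
#   printf '%s\n' "$out"
#   if printf '%s\n' "$out" | deps_ok; then
#       echo "dependencies satisfied"
#   else
#       echo "fix the reported items and run the check again"
#   fi
```

Capturing the output in a variable first keeps the checker's own messages visible, since those messages are what tell you which packages to install.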
Then go back up one level and build:
~/kaldi-trunk/tools/extras$ cd ..
~/kaldi-trunk/tools$ make
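On a virtual machine, I recommend plain make rather than make -j 4; otherwise the build can easily run out of memory and fail. The same applies to the build in the src directory later.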
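One way to make the memory trade-off concrete is to derive the -j value from the memory that is actually available. The sketch below assumes a budget of 2 GiB per compile job; that figure is my own rule of thumb, not something stated in the Kaldi documentation, and `pick_jobs` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: choose a value for "make -j" from the CPU count and
# the available memory. The 2 GiB-per-job budget is an assumed rule of thumb.
pick_jobs() {
    cpus=$1                # number of CPU cores
    mem_kb=$2              # available memory in KiB
    per_job_kb=2097152     # assumed budget: 2 GiB per compile job
    jobs=$((mem_kb / per_job_kb))
    if [ "$jobs" -gt "$cpus" ]; then jobs=$cpus; fi   # never exceed the cores
    if [ "$jobs" -lt 1 ]; then jobs=1; fi             # always run at least one job
    echo "$jobs"
}

# Usage on Linux (commented out so the sketch runs anywhere):
#   make -j "$(pick_jobs "$(nproc)" "$(awk '/MemAvailable/ {print $2}' /proc/meminfo)")"
```

With a small virtual machine this falls back to a single job, which matches the advice above.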
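When make finishes it may report that irstlm is not installed. You can ignore this for now and carry on with the rest of the Kaldi installation.
Enter the src directory and read INSTALL: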
~/kaldi-trunk/tools$ cd ../src
~/kaldi-trunk/src$ cat INSTALL
These instructions are valid for UNIX-like systems (these steps have been run on various Linux distributions; Darwin; Cygwin). For native Windows compilation, see ../windows/INSTALL.

You must first have completed the installation steps in ../tools/INSTALL (compiling OpenFst; getting ATLAS and CLAPACK headers).

The installation instructions are

  ./configure --shared
  make depend -j 8
  make -j 8

Note that we added the "-j 8" to run in parallel because "make" takes a long time. 8 jobs might be too many for a laptop or small desktop machine with not many cores.

For more information, see documentation at http://kaldi-asr.org/doc/ and click on "The build process (how Kaldi is compiled)".
Run configure, without adding the --shared argument:
~/kaldi-trunk/src$ ./configure
Configuring ...
Backing up kaldi.mk to kaldi.mk.bak ...
Checking compiler g++ ...
Checking OpenFst library in /home/zillyrex/kaldi-trunk/tools/openfst ...
Doing OS specific configurations ...
On Linux: Checking for linear algebra header files ...
Using ATLAS as the linear algebra library.
Atlas found in /usr/lib/x86_64-linux-gnu
Validating presence of ATLAS libs in /usr/lib/x86_64-linux-gnu
Using library /usr/lib/x86_64-linux-gnu/liblapack.so as ATLAS's CLAPACK library.
CUDA will not be used! If you have already installed cuda drivers and cuda toolkit, try using --cudatk-dir=... option. Note: this is only relevant for neural net experiments
Info: configuring Kaldi not to link with Speex (don't worry, it's only needed if you intend to use 'compress-uncompress-speex', which is very unlikely)
Successfully configured for Linux [dynamic libraries] with ATLASLIBS =/usr/lib/x86_64-linux-gnu/liblapack.so /usr/lib/x86_64-linux-gnu/libcblas.so /usr/lib/x86_64-linux-gnu/libatlas.so /usr/lib/x86_64-linux-gnu/libf77blas.so
SUCCESS
To compile: make clean -j; make depend -j; make -j
 ... or e.g. -j 10, instead of -j, to use a specified number of CPUs
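Read the output of configure carefully. It may differ from what is shown above: it tells you what is not installed correctly and gives guidance. Follow that guidance to install the missing dependencies, and rerun configure until the output ends, as above, with "SUCCESS To compile: ...". Only then should you proceed; otherwise make will fail with an error after running for a long time.
Run the final step and compile the Kaldi source: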
~/kaldi-trunk/src$ make depend
... ...
~/kaldi-trunk/src$ make
... ... ...
Done
make takes a long time, roughly half an hour to an hour. If no red error appears during compilation and "Done" is printed at the end, the build succeeded.
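Watching a long build for red text is easy to get wrong, so a complementary check is to rely on the exit status of make, which is non-zero when a target fails. This swaps in a different technique from the visual check described above. The sketch below is an assumption about how one might wrap the build, with `run_logged` as a hypothetical helper that keeps the full output in a log file for later inspection.

```shell
#!/bin/sh
# Hypothetical helper: run a command, keep its full output in a log file,
# show the last few lines, and return the command's own exit status.
run_logged() {
    log=$1
    shift
    status=0
    "$@" >"$log" 2>&1 || status=$?
    tail -n 5 "$log"
    return "$status"
}

# Usage, from the src directory (commented out so the sketch runs anywhere):
#   run_logged make_depend.log make depend &&
#   run_logged make.log make &&
#   echo "build finished with a zero exit status"
```

The trade-off of this simple version is that output is not shown live; the log file can be followed from another terminal with tail -f while the build runs.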
Finally, run an example to verify the installation: run.sh in the egs/yesno/s5 directory:
~/kaldi-trunk/src$ cd ../egs/yesno/s5/
~/kaldi-trunk/egs/yesno/s5$ ./run.sh
Preparing train and test data
Dictionary preparation succeeded
utils/prepare_lang.sh --position-dependent-phones false data/local/dict <SIL> data/local/lang data/lang
Checking data/local/dict/silence_phones.txt ...
--> reading data/local/dict/silence_phones.txt
--> text seems to be UTF-8 or ASCII, checking whitespaces
--> text contains only allowed whitespaces
--> data/local/dict/silence_phones.txt is OK
... ... ...
local/score.sh: scoring with word insertion penalty=0.0,0.5,1.0
%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0
If you see the result above, Kaldi has been installed successfully.
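To check the result without scrolling through the whole log, the word error rate can be pulled out of the "%WER" line. The sketch below assumes the line format shown in the output above; `wer_of` is a hypothetical helper, not a Kaldi script.

```shell
#!/bin/sh
# Hypothetical helper: print the word error rate from the first line on
# stdin whose first field is "%WER" (format assumed from the output above).
wer_of() {
    awk '$1 == "%WER" { print $2; exit }'
}

# Demonstration on the line from the transcript above.
sample='%WER 0.00 [ 0 / 232, 0 ins, 0 del, 0 sub ] exp/mono0a/decode_test_yesno/wer_10_0.0'
printf '%s\n' "$sample" | wer_of

# Usage against a real run (commented out so the sketch runs anywhere):
#   ./run.sh > run.log 2>&1
#   wer_of < run.log
```

Writing the output of run.sh to a file first, rather than piping it straight into the helper, avoids cutting the run short when awk exits after the first match.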