The command for building a language model with IRSTLM in section 2.3.4 (Baseline System) of the Moses manual is wrong

Date: 2019-12-01

The manual gives this command:

 ~/irstlm/bin/compile-lm  \
   --text yes \
   news-commentary-v8.fr-en.lm.en.gz \
   news-commentary-v8.fr-en.arpa.en

The help text of compile-lm, however, reads:

compile-lm - compiles an ARPA format LM into an IRSTLM format one

USAGE:
       compile-lm [options] <input-file.lm> [output-file.blm]

DESCRIPTION:
       compile-lm reads a standard LM file in ARPA format and produces
       a compiled representation that the IRST LM toolkit can quickly
       read and process. LM file can be compressed.

OPTIONS:
Parameters:
    Help:      print this help
    d:      verbose output for --eval option; default is 0
    debug:      verbose output for --eval option; default is 0
    dict_load_factor:      sets the load factor for ngram cache; it should be a positive real value; default is 0
    dub:      dictionary upperbound to compute OOV word penalty: default 10^7
    e:      computes perplexity of the specified text file
    eval:      computes perplexity of the specified text file
    f:      filter a binary language model with a word list
    filter:      filter a binary language model with a word list
    h:      print this help
    i:      builds an inverted n-gram binary table for fast access; default if false
    invert:      builds an inverted n-gram binary table for fast access; default if false
    keepunigrams:      filter by keeping all unigrams in the table, default  is true
    ku:      filter by keeping all unigrams in the table, default  is true
    l:      maximum level to load from the LM; if value is larger than the actual LM order, the latter is taken
    level:      maximum level to load from the LM; if value is larger than the actual LM order, the latter is taken
    memmap:      uses memory map to read a binary LM
    mm:      uses memory map to read a binary LM
    ngram_load_factor:      sets the load factor for ngram cache; it should be a positive real value; default is false
    r:      computes N random calls on the specified text file
    randcalls:      computes N random calls on the specified text file
    s:      computes log-prob scores of n-grams from standard input
    score:      computes log-prob scores of n-grams from standard input
    sentence:      computes perplexity at sentence level (identified through the end symbol)
    t:      output is again in text format; default is false
    text:      output is again in text format; default is false
    tmpdir:      directory for temporary computation, default is either the environment variable TMP if defined or "/tmp")
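
One reading of the help text is that `text` is listed as a parameter with its own default value, so it may need to carry its value in the same token rather than take the next argument. The following is a minimal sketch of the invocation under that assumption. The `--text=yes` spelling is an assumption and does not appear in the help output quoted here; the paths are the ones from the manual's command. The guard skips the call when the binary or the input file is missing.

```shell
#!/bin/sh
# Paths taken from the manual's command; override COMPILE_LM if IRSTLM lives elsewhere.
COMPILE_LM="${COMPILE_LM:-$HOME/irstlm/bin/compile-lm}"
INPUT="news-commentary-v8.fr-en.lm.en.gz"
OUTPUT="news-commentary-v8.fr-en.arpa.en"

# Assumption: the option and its value are passed as one token (--text=yes)
# instead of two separate arguments (--text yes).
TEXT_OPT="--text=yes"

if [ -x "$COMPILE_LM" ] && [ -f "$INPUT" ]; then
    # Options first, then input file, then output file, as in the USAGE line.
    "$COMPILE_LM" "$TEXT_OPT" "$INPUT" "$OUTPUT"
else
    echo "skipping: $COMPILE_LM or $INPUT not found" >&2
fi
```

If this form is also rejected by the installed version, the parameter list printed by compile-lm itself is the thing to check, since option syntax can differ between IRSTLM releases.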