Kaldi nnet2 Components

Date: 2019-11-10

Tags: kaldi nnet2 nnet component

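FixedAffineComponent: the LDA-like decorrelating transform. It is a fixed affine transform (Wx+b) that is not updated during training. By contrast, the plain AffineComponent is a standard weight matrix plus bias (Wx+b), trained with standard stochastic gradient descent (non-minibatch SGD?) using a global learning rate.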

AffineComponentPreconditionedOnline: a refinement of AffineComponent. During training it uses not only the global learning rate but also a matrix-valued learning rate to precondition the gradient descent. See dnn2_preconditioning.

PnormComponent: the nonlinearity. A traditional neural network model would use TanhComponent here instead.

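NormalizeComponent: used to stabilize the training of p-norm networks. It is a fixed, non-trainable nonlinearity. It does not act on individual activations (the activations of single nodes) but on the whole vector of a single frame, renormalizing it to unit standard deviation.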

SoftmaxComponent: the final nonlinearity, which produces properly normalized probabilities at the output.

SpliceComponent: defines the window size used for feature-frame splicing.

SigmoidComponent
- dim
TanhComponent
- dim
PowerComponent
- power
- dim
- input-dim
SoftmaxComponent
- Independent of the target training set.
- dim
LogSoftmaxComponent
- dim
RectifiedLinearComponent
- dim
NormalizeComponent
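- Normalization layer: normalizes its input. During network training the input is a mini-batch, i.e. a matrix containing several feature vectors, and the normalization layer normalizes each row (frame) of this mini-batch. This component has only one parameter and is independent of the target training set.
- Used to stabilize the training of p-norm networks. It is a fixed, non-trainable nonlinearity. It does not act on individual activations (the activations of single nodes) but on the whole vector of a single frame, renormalizing it to unit standard deviation.
- dim
  - Input feature dimension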
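
The per-frame renormalization described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not Kaldi's implementation: the function names and the `floor` guard against division by zero are hypothetical, and "unit standard deviation" is taken here to mean a root-mean-square of 1 over the elements of each frame.

```python
import math


def normalize_frame(frame, floor=1e-20):
    """Scale one frame so the root-mean-square of its elements is 1.

    `floor` is a hypothetical guard against dividing by zero for an
    all-zero frame; it is an assumption of this sketch.
    """
    mean_square = sum(v * v for v in frame) / len(frame)
    scale = 1.0 / math.sqrt(max(mean_square, floor))
    return [v * scale for v in frame]


def normalize_minibatch(minibatch):
    """Normalize every row (frame) of a mini-batch independently."""
    return [normalize_frame(frame) for frame in minibatch]


batch = [[3.0, 4.0], [1.0, -1.0]]
normalized = normalize_minibatch(batch)
```

Each row is scaled on its own, so the result for one frame does not depend on the other frames in the mini-batch.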
SoftHingeComponent
- dim
PnormComponent
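- This component has only three parameters. The input and output dimensions depend on the neighboring layers and the parameter p is fixed, so it is independent of the target training set.
- The nonlinearity. A traditional neural network model would use TanhComponent here instead.
- output-dim
  - The output dimension is usually one tenth of the input dimension, for example: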
    
    pnorm_input_dim=3000
    
    pnorm_output_dim=300
- input-dim
- p
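
As a rough illustration of how the three parameters interact, the sketch below computes a p-norm over consecutive groups of inputs, so that `input-dim / output-dim` inputs collapse into one output. The function name and the choice of consecutive grouping are assumptions made for this example; it is not taken from the Kaldi source.

```python
def pnorm(frame, output_dim, p=2.0):
    """Reduce `frame` to `output_dim` values by taking the p-norm of each group.

    Assumes the input dimension is a multiple of the output dimension and
    that each group consists of consecutive input elements.
    """
    input_dim = len(frame)
    if input_dim % output_dim != 0:
        raise ValueError("input dimension must be a multiple of output_dim")
    group_size = input_dim // output_dim
    outputs = []
    for j in range(output_dim):
        group = frame[j * group_size:(j + 1) * group_size]
        outputs.append(sum(abs(v) ** p for v in group) ** (1.0 / p))
    return outputs


# Four inputs reduced to two outputs, i.e. groups of size 2.
example = pnorm([3.0, 4.0, 0.0, -2.0], output_dim=2, p=2.0)
```

With `pnorm_input_dim=3000` and `pnorm_output_dim=300` as above, each output would summarize a group of 10 inputs.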
MaxoutComponent
- output-dim
- input-dim
ScaleComponent
- dim
- scale
AffineComponent
- learning-rate // optional.
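- If matrix is specified, the affine transform is read from matrix
  - matrix
  - input-dim // optional; must match matrix
  - output-dim // optional; must match matrix
- If matrix is not specified, a new affine transform is created
  - param-stddev
    - parameter standard deviation: the standard deviation of the weights
    - The initial weights are drawn at random with this standard deviation, which keeps them within a limited range so that no parameter starts out too large
  - bias-stddev
    - bias standard deviation: the standard deviation of the biases
    - The initial biases are drawn at random with this standard deviation, which keeps them within a limited range so that no bias starts out too large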
  - input-dim
  - output-dim
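
The "create a new affine transform" case can be sketched as follows: the weights and biases are drawn from zero-mean Gaussians whose standard deviations play the role of `param-stddev` and `bias-stddev`, and the forward pass computes Wx+b. The function names, the `seed` argument and the default values are assumptions of this sketch rather than a copy of Kaldi's code.

```python
import math
import random


def init_affine(input_dim, output_dim, param_stddev=None, bias_stddev=1.0, seed=0):
    """Create a random weight matrix (output_dim x input_dim) and bias vector.

    The default param_stddev of 1/sqrt(input_dim) is an assumption of this
    sketch; pass an explicit value to mirror a particular config.
    """
    if param_stddev is None:
        param_stddev = 1.0 / math.sqrt(input_dim)
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, param_stddev) for _ in range(input_dim)]
               for _ in range(output_dim)]
    bias = [rng.gauss(0.0, bias_stddev) for _ in range(output_dim)]
    return weights, bias


def affine_forward(weights, bias, x):
    """Compute y = Wx + b for a single input vector x."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, bias)]


weights, bias = init_affine(input_dim=4, output_dim=3, param_stddev=0.5)
y = affine_forward(weights, bias, [1.0, 0.0, -1.0, 2.0])
```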
AffineComponentPreconditioned
- learning-rate // optional.
- alpha //Precondition
- max-change //Precondition
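- If matrix is specified, the affine transform is read from matrix
  - matrix
  - input-dim // optional; must match matrix
  - output-dim // optional; must match matrix
- If matrix is not specified, a new affine transform is created
  - param-stddev
    - parameter standard deviation: the standard deviation of the weights
    - The initial weights are drawn at random with this standard deviation, which keeps them within a limited range so that no parameter starts out too large
  - bias-stddev
    - bias standard deviation: the standard deviation of the biases
    - The initial biases are drawn at random with this standard deviation, which keeps them within a limited range so that no bias starts out too large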
  - input-dim
  - output-dim
AffineComponentPreconditionedOnline
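- The weight layer of a fully connected layer. In Kaldi's representation, one network layer is split into a weight layer and a nonlinearity layer that follows it. The weight layer holds the connection parameters W, which can be updated, while the nonlinearity layer that follows (such as SoftmaxComponent below) is fixed.
- A refinement of AffineComponent: during training it uses not only the global learning rate but also a matrix-valued learning rate to precondition the gradient descent. See dnn2_preconditioning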
- learning-rate // optional.
- num-samples-history
- alpha //Precondition
- max-change-per-sample //Precondition
- rank-in //Online
- rank-out //Online
- update-period //Online
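- If matrix is specified, the affine transform is read from matrix
  - matrix
  - input-dim // optional; must match matrix
  - output-dim // optional; must match matrix
- If matrix is not specified, a new affine transform is created
  - param-stddev
    - parameter standard deviation: the standard deviation of the weights
    - The initial weights are drawn at random with this standard deviation, which keeps them within a limited range so that no parameter starts out too large
  - bias-stddev
    - bias standard deviation: the standard deviation of the biases
    - The initial biases are drawn at random with this standard deviation, which keeps them within a limited range so that no bias starts out too large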
  - input-dim
  - output-dim
SumGroupComponent
- sizes
BlockAffineComponent
- learning-rate // optional.
- input-dim
- output-dim
- num-blocks
- param-stddev
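  - parameter standard deviation: the standard deviation of the weights
  - The initial weights are drawn at random with this standard deviation, which keeps them within a limited range so that no parameter starts out too large
- bias-stddev
  - bias standard deviation: the standard deviation of the biases
  - The initial biases are drawn at random with this standard deviation, which keeps them within a limited range so that no bias starts out too large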
BlockAffineComponentPreconditioned
- learning-rate // optional.
- alpha //Precondition
- input-dim
- output-dim
- num-blocks
- param-stddev
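  - parameter standard deviation: the standard deviation of the weights
  - The initial weights are drawn at random with this standard deviation, which keeps them within a limited range so that no parameter starts out too large
- bias-stddev
  - bias standard deviation: the standard deviation of the biases
  - The initial biases are drawn at random with this standard deviation, which keeps them within a limited range so that no bias starts out too large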
PermuteComponent
- dim
DctComponent
- dim
- dct-dim
- reorder
- dct-keep-dim
FixedLinearComponent
- matrix
FixedAffineComponent
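- The LDA-like decorrelating transform: a fixed affine transform (Wx+b), read from matrix, that is not updated during training. By contrast, the plain AffineComponent is a standard weight matrix plus bias trained with standard stochastic gradient descent (non-minibatch SGD?) using a global learning rate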
- matrix
FixedScaleComponent
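- Fixed activation-rescaling component
- This component sits before SoftmaxComponent and has the same dimension as SoftmaxComponent, namely the number of senones. Its parameter is a vector of prior probabilities whose i-th element is the probability of the i-th senone over all alignments ($alidir/ali.*.gz), i.e. the number of occurrences of senone i divided by the total number of occurrences of all senones
- scales: the prior-probability parameter, which has to be obtained from the alignments ($alidir/ali.*.gz) and the model ($alidir/final.mdl)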
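
The prior computation described above (occurrences of senone i divided by all senone occurrences) can be sketched as below. The sketch assumes the alignments have already been converted into per-frame senone ids, which is the step that needs the model; the function name and the toy data are hypothetical.

```python
from collections import Counter


def senone_priors(alignments, num_senones):
    """Estimate one prior per senone from per-frame senone ids.

    `alignments` is a list of utterances, each a list of integer senone ids
    (assumed to be already derived from the alignment files and the model).
    """
    counts = Counter()
    for utterance in alignments:
        counts.update(utterance)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no frames in alignments")
    return [counts[i] / total for i in range(num_senones)]


# Toy alignments: two utterances over three senones.
priors = senone_priors([[0, 0, 1], [2, 0]], num_senones=3)
```

Senones that never occur get a prior of zero here; a real recipe would probably want to floor or smooth such values before using them as scales.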
FixedBiasComponent
- bias
SpliceComponent
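- Expands the input features to the left and right so that the network can capture correlations between frames. For example, to identify which triphone the current frame belongs to, the 5 frames before and the 5 frames after the current frame can be combined with it into a single feature of 11 frames that is used as the network input.
- Defines the window size used for feature-frame splicing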
- input-dim
- context
- left-context
- right-context
- const-component-dim = 0
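
A minimal sketch of frame splicing with `left-context` and `right-context` follows. It assumes that output is produced only for frames with full context on both sides (edge frames are not padded) and it ignores `const-component-dim`; both are simplifications made for this example, and the function name is hypothetical.

```python
def splice(frames, left_context, right_context):
    """Concatenate each frame with its neighbors into one longer vector.

    Output is produced only for frames that have `left_context` frames
    before them and `right_context` frames after them.
    """
    spliced_frames = []
    for t in range(left_context, len(frames) - right_context):
        spliced = []
        for frame in frames[t - left_context:t + right_context + 1]:
            spliced.extend(frame)
        spliced_frames.append(spliced)
    return spliced_frames


# Five 1-dimensional frames spliced with one frame of context on each side.
spliced = splice([[0], [1], [2], [3], [4]], left_context=1, right_context=1)
```

With 5 frames of context on each side, as in the triphone example above, every output vector would hold 11 frames' worth of features.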
SpliceMaxComponent
- dim
- context
- left-context
- right-context
DropoutComponent
- dim
- dropout-proportion
- dropout-scale
AdditiveNoiseComponent
- dim
- stddev
Convolutional1dComponent
- learning-rate
- appended-conv
- patch-dim
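  - Size (dimension) of the convolution filter
- patch-step = 1
  - Step size by which the filter moves each time. If it is at least patch-dim, the patches used in the convolution do not overlap.
- patch-stride
  - The convolutional layer reshapes the input feature vector into a two-dimensional matrix (similar to an image) before convolving. This value determines the number of rows of that matrix, and the filter is also affected by it.
  - Taking the recipe code shipped with Kaldi as an example:
  - The input of the first convolutional layer is a one-dimensional feature vector of size 36*3*11. Setting this value to the fbank dimension without delta features (i.e. 36) turns the input feature vector into a 36*33 feature matrix, which is then convolved with the filter (7*33).
  - The input of the second convolutional layer is the output of the pooling layer. Setting this value to the input dimension means the resulting feature matrix is still the original vector.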
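- If matrix is specified, the affine transform is read from matrix
  - matrix
  - input-dim
    - optional; must match matrix
  - output-dim
    - optional; must match matrix
- If matrix is not specified, a new affine transform is created
  - param-stddev
    - parameter standard deviation: the standard deviation of the weights
    - The initial weights are drawn at random with this standard deviation, which keeps them within a limited range so that no parameter starts out too large
  - bias-stddev
    - bias standard deviation: the standard deviation of the biases
    - The initial biases are drawn at random with this standard deviation, which keeps them within a limited range so that no bias starts out too large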
  - input-dim
  - output-dim
- // propagation function
- /*
- In Convolutional1dComponent, filter is defined $num-filters x $filter-dim,
- and bias vector B is defined by length $num-filters. The propagation is
- Y = X o A' + B
- where "o" is executing matrix-matrix convolution, which consists of a group
- of vector-matrix convolutions.
- For instance, the convolution of X(t) and the i-th filter A(i) is
- Y(t,i) = X(t) o A'(i) + B(i)
- The convolution used here is valid convolution. Meaning that the
- output of M o N is of dim |M| - |N| + 1, assuming M is not shorter than N.
- By default, input is arranged by
- x (time), y (channel), z(frequency)
- and output is arranged by
- x (time), y (frequency), z(channel).
- When appending convolutional1dcomponent, appended_conv_ should be
- set to true for the appended convolutional1dcomponent.
- */
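
To make the "valid convolution" size rule concrete, here is a small sketch for a single input vector and a single filter. It slides the filter over the input as a dot product without flipping it, adds a scalar bias, and supports a step like `patch-step`; whether the filter is flipped, and the function name itself, are assumptions of this sketch and not statements about the Kaldi code.

```python
def valid_conv1d(signal, filt, bias=0.0, step=1):
    """Slide `filt` over `signal` and return one output per full overlap.

    With step=1 the output has len(signal) - len(filt) + 1 elements,
    matching the |M| - |N| + 1 rule for valid convolution.
    """
    n = len(filt)
    if len(signal) < n:
        raise ValueError("signal must not be shorter than the filter")
    return [sum(s * f for s, f in zip(signal[i:i + n], filt)) + bias
            for i in range(0, len(signal) - n + 1, step)]


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
filt = [1.0, 0.0, -1.0]
dense = valid_conv1d(signal, filt)           # step 1: 5 - 3 + 1 = 3 outputs
strided = valid_conv1d(signal, filt, step=2)  # step 2: every other position
```

Applying one such filter per row of the filter matrix, each with its own bias, gives one output group per filter, which is the structure the comment above describes.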
MaxpoolingComponent
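- Pooling layer component. This layer applies max pooling to the convolutional features: within a given range (the pooling area) it picks the largest output of the same filter as the input to the next layer, and the pooling windows do not overlap. Besides reducing dimensionality, a more important benefit of pooling is that it removes some of the perturbations in the input features.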
- input-dim
- output-dim
- pool-size
  - Pooling area
- pool-stride
  - Pooling range. As in the convolutional layer, the vector is reshaped into a matrix for processing.
- /*
- Input and output of maxpooling component is arranged as
- x (time), y (frequency), z (channel)
- for efficient pooling.
- */
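
Non-overlapping max pooling over the outputs of one filter can be sketched as below. The sketch works on a flat list for a single filter and leaves out the vector-to-matrix reshaping governed by `pool-stride`; the function name is hypothetical.

```python
def max_pool(values, pool_size):
    """Take the maximum of each consecutive, non-overlapping window."""
    if pool_size <= 0 or len(values) % pool_size != 0:
        raise ValueError("len(values) must be a positive multiple of pool_size")
    return [max(values[i:i + pool_size])
            for i in range(0, len(values), pool_size)]


# Six outputs of one filter pooled in windows of three.
pooled = max_pool([1.0, 5.0, 2.0, 8.0, 3.0, 0.0], pool_size=3)
```

Small shifts of a peak inside a window leave the pooled value unchanged, which is one way to read the claim that pooling removes some perturbations.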
