AI => LSTM and GRU in TF20: the source code behind the return_sequences and return_state parameters

Date: 2019-12-20

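Preface

Friendly reminder:

This post is only for readers who understand the structure of LSTM and GRU but are unsure what the LSTM and GRU parameters in TensorFlow 2.0 mean.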

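Additional notes

Reading source code is not about looking sophisticated.
Sometimes you have combed through every blog you can find, only to discover that they either cite each other or "borrow" from each other, and you are about to give up.
You might then turn to the documentation, but some of it is not very detailed.
At that point, reading the source is the best way to understand what is going on. (The relevant parts of the LSTM and GRU source are fairly readable.)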

The title ran out of room, so: TF20 ==> TensorFlow 2.0 (Stable)
tk ===> tensorflow.keras
LSTM and GRU both live in the tk.layers module.

return_sequences = True
return_state = True

These two parameters are the most frequently used ones, and both LSTM and GRU have them.
So what exactly do they mean?
Let's get started!

How to get to the source:
    import tensorflow.keras as tk
    tk.layers.GRU
    tk.layers.LSTM
    In PyCharm, Ctrl + left-click either class name to jump to its source.
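
If you are not using PyCharm, Python's standard inspect module can point you to the same files. This is only a small sketch, assuming TensorFlow 2.x is installed; the paths it prints (and the code inside those files) depend on your installed version, so they may not match the excerpts in this post line for line.

```python
import inspect

import tensorflow as tf

# Path of the file that defines each layer class (varies by TF/Keras version).
lstm_file = inspect.getsourcefile(tf.keras.layers.LSTM)
gru_file = inspect.getsourcefile(tf.keras.layers.GRU)

print(lstm_file)
print(gru_file)
```

Open the printed files in any editor and search for return_state to land near the logic discussed in this post.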

LSTM source

Here is an excerpt of the core part of the source:

...
...
  states = [new_h, new_c]           # clearly, the first is the hidden state h and the second is the cell state c

if self.return_sequences:         # if return_sequences is True
  output = outputs                    # output is the output y of every time step (nothing is returned yet)
else:                             # if return_sequences is False
  output = last_output                # output is only the last time step's output (nothing is returned yet)

if self.return_state:             # if return_state is True
  return [output] + list(states)      # return the output above plus [new_h, new_c]
else:                             # if return_state is False
  return output                       # return only the output above

Tip: look for the return keyword, and it becomes clear exactly what gets returned.
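
To make the branching easier to play with, here is a plain-Python sketch of the same logic. The function name assemble_return and the placeholder strings are made up for illustration; they are not part of TensorFlow. The strings simply stand in for tensors so the structure of the result is easy to see.

```python
def assemble_return(outputs, last_output, states, return_sequences, return_state):
    """Hypothetical helper mirroring the branching in the LSTM excerpt."""
    # return_sequences only decides which value `output` refers to.
    output = outputs if return_sequences else last_output
    # return_state decides whether the states are appended to the result.
    if return_state:
        return [output] + list(states)
    return output


# Placeholder strings stand in for real tensors.
outputs, last_output = "y_all_steps", "y_last_step"
lstm_states = ["h", "c"]

for rs in (False, True):
    for st in (False, True):
        result = assemble_return(outputs, last_output, lstm_states, rs, st)
        print(f"return_sequences={rs}, return_state={st} -> {result}")
```

Note that with return_state=False the result is a single value, while with return_state=True it is a list whose first element is the output.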

GRU source

...
...
########  This is the part we mainly care about  ######################
  last_output, outputs, runtime, states = self._defun_gru_call( 
      inputs, initial_state, training, mask)
#####################################################################          
...
...

######### No need to read on: the code below is identical to LSTM's #########
if self.return_sequences:
  output = outputs
else:
  output = last_output

if self.return_state:
  return [output] + list(states)
else:
  return output

The key question now is: how is states obtained?
Click further into the source of "self._defun_gru_call" and you will find states sitting right there:

states = [new_h]
return ..., states

The source analysis is now almost complete. Let's step back and summarize:

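In LSTM and GRU, the source that handles return_sequences and return_state is identical!
    return_sequences: only controls what the output variable is assigned (the last time step, or all time steps)
    return_state: returns the output variable and, if set to True, additionally returns the contents of the states variable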
    
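That shifts our attention to the output variable and the states variable:

The output variable of LSTM and GRU: roughly the same, nothing to worry about.
The states variable of LSTM and GRU:
    LSTM states:  [H, C]    # if you know the LSTM structure, this should be clear: LSTM has both H and C
    GRU states:   [H]       # if you know the GRU structure, this should be clear: GRU has only H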
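
A quick way to see this difference for yourself is to count what each layer hands back when return_state=True. This is a minimal sketch, assuming TensorFlow 2.x with eager execution; the input shape and the number of units are arbitrary values chosen for illustration.

```python
import numpy as np
import tensorflow as tf

# Dummy input: (batch, time steps, features). All sizes here are arbitrary.
x = np.random.rand(2, 5, 3).astype("float32")

lstm_result = tf.keras.layers.LSTM(4, return_state=True)(x)
gru_result = tf.keras.layers.GRU(4, return_state=True)(x)

print(len(lstm_result))  # 3 -> output, H, C
print(len(gru_result))   # 2 -> output, H

# So the unpacking differs between the two layers:
lstm_out, lstm_h, lstm_c = lstm_result
gru_out, gru_h = gru_result
```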

Final usage summary:

LSTM:

There are four combinations:

return_sequences = False and return_state = False (default)
```
Returns: only the output Y of the last LSTM time step
```
return_sequences = True and return_state = False
```
Returns: only the outputs Y of all LSTM time steps
```

return_sequences = False and return_state = True

Returns: the output Y of the last LSTM time step, plus the two states H and C (in that order)

return_sequences = True and return_state = True

Returns: the outputs Y of all LSTM time steps, plus the two states H and C (in that order)  (useful for Attention)
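
The four LSTM combinations can be checked by looking at the shapes that come back. This is a rough sketch, assuming TensorFlow 2.x with eager execution; batch, steps, features and units are arbitrary sizes picked for the demo, and each call builds a fresh layer with its own random weights.

```python
import numpy as np
import tensorflow as tf

batch, steps, features, units = 2, 5, 3, 4  # arbitrary demo sizes
x = np.random.rand(batch, steps, features).astype("float32")

# 1) Defaults: only the last time step's output.
y_last = tf.keras.layers.LSTM(units)(x)
print(np.asarray(y_last).shape)  # (2, 4)

# 2) return_sequences=True: one output per time step.
y_seq = tf.keras.layers.LSTM(units, return_sequences=True)(x)
print(np.asarray(y_seq).shape)  # (2, 5, 4)

# 3) return_state=True: last output plus H and C.
y3, h3, c3 = tf.keras.layers.LSTM(units, return_state=True)(x)
print(np.asarray(y3).shape, np.asarray(h3).shape, np.asarray(c3).shape)

# 4) Both True: all outputs plus H and C.
y4, h4, c4 = tf.keras.layers.LSTM(
    units, return_sequences=True, return_state=True
)(x)
print(np.asarray(y4).shape, np.asarray(h4).shape, np.asarray(c4).shape)
```

In cases 3 and 4 the states always have shape (batch, units); only the first returned value changes shape with return_sequences.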

GRU:

There are four combinations:

return_sequences = False and return_state = False (default)
```
Returns: same as LSTM
```
return_sequences = True and return_state = False
```
Returns: same as LSTM
```

return_sequences = False and return_state = True

Returns: the output Y of the last GRU time step, plus a single state H

return_sequences = True and return_state = True

Returns: the outputs Y of all GRU time steps, plus a single state H  (useful for Attention)
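
For GRU with both flags on, it can help to see how the returned sequence relates to the returned state. This is a minimal sketch, assuming TensorFlow 2.x with eager execution and no masking; the sizes are arbitrary. With these settings, the last time step of the sequence should match H up to floating-point tolerance.

```python
import numpy as np
import tensorflow as tf

# Dummy input: (batch, time steps, features). Sizes are arbitrary.
x = np.random.rand(2, 5, 3).astype("float32")

gru = tf.keras.layers.GRU(4, return_sequences=True, return_state=True)
seq, h = gru(x)

seq = np.asarray(seq)  # (batch, time steps, units)
h = np.asarray(h)      # (batch, units)
print(seq.shape, h.shape)

# The final time step of the sequence carries the same values as H.
last_matches_state = np.allclose(seq[:, -1, :], h, atol=1e-5)
print(last_matches_state)
```

This is why an attention layer can take the full sequence as its per-step inputs while H is passed along separately as the final summary state.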
