day-18 Moving Average Model Test Example

        To make a trained model perform better on test data, a new technique can be introduced: the moving average model. It maintains a shadow variable for each training parameter and uses the shadow values, rather than the final trained values, to evaluate the model.

 

      TensorFlow provides ExponentialMovingAverage to implement the moving average model. For each variable it maintains a shadow variable, computed as:

shadow_variable = decay * shadow_variable + (1 - decay) * variable
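The recurrence can be illustrated in plain Python (a minimal sketch of the formula itself, not the TensorFlow API; the helper name `ema_update` is hypothetical):

```python
def ema_update(shadow, variable, decay):
    # One step of the update: the shadow moves a fraction (1 - decay) toward the variable.
    return decay * shadow + (1 - decay) * variable

# With decay = 0.99 the shadow drifts only slowly toward a constant value of 5.0:
shadow = 0.0
for _ in range(3):
    shadow = ema_update(shadow, 5.0, 0.99)
print(round(shadow, 6))  # 0.148505
```

This is why decay values close to 1.0 produce a smoothed, slowly-changing copy of the variable.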

        When training a model, it is often beneficial to maintain moving averages of the trained parameters. Evaluations that use the averaged values sometimes produce better results than the final trained values.
The apply() method adds a shadow copy of each training variable and registers an op that maintains the moving averages in those shadow copies. This op is typically run after each training step.
The average() and average_name() methods give access to the shadow variables and their names. This is useful when building an evaluation model, or when restoring a model from a checkpoint file: at evaluation time they let you substitute the moving averages for the last trained values. Using the model takes three steps:

       1. Create a moving average object

step = tf.Variable(initial_value=0,dtype=tf.float32,trainable=False)
ema = tf.train.ExponentialMovingAverage(decay=0.99,num_updates=step)

        decay is the decay factor in the formula above; a reasonable value is close to 1.0, e.g. 0.999 or 0.9999. num_updates is an optional argument; when it is supplied, the decay actually used is given by:

min(decay, (1 + num_updates) / (10 + num_updates)). The goal is to make the shadow variables update faster at the start of training, so num_updates is usually an incrementing training-step variable.
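As a quick check of this schedule, the effective decay can be tabulated in plain Python (`effective_decay` is a hypothetical helper mirroring the documented formula, not a TensorFlow function):

```python
def effective_decay(decay, num_updates):
    # Decay actually used when num_updates is supplied.
    return min(decay, (1 + num_updates) / (10 + num_updates))

for step in [0, 10, 100, 10000]:
    print(step, round(effective_decay(0.99, step), 4))
# 0 0.1       -> early on, the shadow tracks the variable quickly
# 10 0.55
# 100 0.9182
# 10000 0.99  -> capped by the configured decay
```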

       2. Add the training parameters to the model for maintenance

       Create two training variables and add them to the moving average object; apply() accepts a list of variables.

var0 = tf.Variable(initial_value=0,dtype=tf.float32,trainable=False)
var1 = tf.Variable(initial_value=0,dtype=tf.float32,trainable=False)
maintain_averages_op = ema.apply([var0,var1])

       3. After each training step, update the shadow variables in the model

sess.run(maintain_averages_op)
print(sess.run([var0,ema.average(var0),var1,ema.average(var1)]))  # e.g. [10.0, 4.555, 10.0, 9.01]

         A complete moving average test example follows:

# Import the TensorFlow library
import tensorflow as tf

# Create a moving average object
step = tf.Variable(initial_value=0,dtype=tf.float32,trainable=False)
ema = tf.train.ExponentialMovingAverage(decay=0.99,num_updates=step)

# Create two training variables and add them to the moving average object,
# which creates a shadow variable for each of them:
#     shadow_variable = decay * shadow_variable + (1 - decay) * variable
# If num_updates was given at construction, decay = min(decay, (1 + num_updates)/(10 + num_updates))
var0 = tf.Variable(initial_value=0,dtype=tf.float32,trainable=False)
var1 = tf.Variable(initial_value=0,dtype=tf.float32,trainable=False)
maintain_averages_op = ema.apply([var0,var1])

# Test updating the shadow variable values
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)

    # First update of the moving averages
    sess.run(maintain_averages_op)
    # decay = min(0.99, (1+0)/(10+0)) = 0.1
    # Initially:
    # shadow_variable_var0 = var0 = 0
    # shadow_variable_var1 = var1 = 0
    print(sess.run([var0,ema.average(var0),var1,ema.average(var1)]))

    # Second update of the moving averages
    sess.run(tf.assign(var0,5.0))
    sess.run(tf.assign(var1, 10.0))
    # decay = min(0.99,(1+0)/(10+0)) = 0.1
    # shadow_variable_var0 = decay * shadow_variable + (1 - decay) * variable = 0.1*0 + (1-0.1)*5 = 4.5
    # shadow_variable_var1 = 9.0
    sess.run(maintain_averages_op)
    print(sess.run([var0,ema.average(var0),var1,ema.average(var1)]))  # output: [5.0, 4.5, 10.0, 9.0]

    # Third update of the moving averages
    sess.run(tf.assign(step,10000))
    sess.run(tf.assign(var0,10))
    # decay = min(0.99,(1+10000)/(10+10000)) = 0.99
    # shadow_variable_var0 = decay * shadow_variable + (1 - decay) * variable = 0.99*4.5 + (1-0.99)*10 = 4.555
    # shadow_variable_var1 = 0.99*9.0+(1-0.99)*10 = 9.01
    sess.run(maintain_averages_op)
    print(sess.run([var0,ema.average(var0),var1,ema.average(var1)]))  # output: [10.0, 4.555, 10.0, 9.01]

    # Fourth update of the moving averages
    # decay = min(0.99,(1+10000)/(10+10000)) = 0.99
    # shadow_variable_var0 = decay * shadow_variable + (1 - decay) * variable = 0.99*4.555 + (1-0.99)*10 = 4.60945
    # shadow_variable_var1 = 0.99*9.01+(1-0.99)*10 = 9.0199
    sess.run(maintain_averages_op)
    print(sess.run([var0, ema.average(var0), var1, ema.average(var1)]))  # output: [10.0, 4.60945, 10.0, 9.0199]
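The values printed by the session above can be verified without TensorFlow by replaying the same updates in plain Python (a sketch of the arithmetic only; the helper names are made up):

```python
def effective_decay(decay, num_updates):
    # Decay actually used when num_updates is supplied.
    return min(decay, (1 + num_updates) / (10 + num_updates))

def ema_step(shadow, var, decay, num_updates):
    d = effective_decay(decay, num_updates)
    return d * shadow + (1 - d) * var

s0 = s1 = 0.0  # shadows start at the variables' initial values (0, 0)
step = 0

# First update: the variables are still 0, so the shadows stay 0.
s0, s1 = ema_step(s0, 0.0, 0.99, step), ema_step(s1, 0.0, 0.99, step)

# Second update: var0 = 5.0, var1 = 10.0, effective decay = min(0.99, 1/10) = 0.1
s0, s1 = ema_step(s0, 5.0, 0.99, step), ema_step(s1, 10.0, 0.99, step)
print(round(s0, 6), round(s1, 6))  # 4.5 9.0

# Third update: step = 10000 raises the effective decay to 0.99; var0 is now 10.0
step = 10000
s0, s1 = ema_step(s0, 10.0, 0.99, step), ema_step(s1, 10.0, 0.99, step)
print(round(s0, 6), round(s1, 6))  # 4.555 9.01

# Fourth update: same inputs; the shadows keep drifting slowly toward 10.
s0, s1 = ema_step(s0, 10.0, 0.99, step), ema_step(s1, 10.0, 0.99, step)
print(round(s0, 6), round(s1, 6))  # 4.60945 9.0199
```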

        Below are the two usage scenarios given in the official TensorFlow documentation:

  Example usage when creating a training model:

  ```python
  # Create variables.
  var0 = tf.Variable(...)
  var1 = tf.Variable(...)
  # ... use the variables to build a training model...
  ...
  # Create an op that applies the optimizer.  This is what we usually
  # would use as a training op.
  opt_op = opt.minimize(my_loss, [var0, var1])

  # Create an ExponentialMovingAverage object
  ema = tf.train.ExponentialMovingAverage(decay=0.9999)

  with tf.control_dependencies([opt_op]):
      # Create the shadow variables, and add ops to maintain moving averages
      # of var0 and var1. This also creates an op that will update the moving
      # averages after each training step.  This is what we will use in place
      # of the usual training op.
      training_op = ema.apply([var0, var1])

  ...train the model by running training_op...
  ```


  There are two ways to use the moving averages for evaluations:

  * Build a model that uses the shadow variables instead of the variables. For this, use the `average()` method which returns the shadow variable for a given variable.
  * Build a model normally but load the checkpoint files to evaluate by using the shadow variable names. For this use the `average_name()` method. See the `tf.train.Saver` documentation for more information on restoring saved variables.

  Example of restoring the shadow variable values:

  ```python
  # Create a Saver that loads variables from their saved shadow values.
  shadow_var0_name = ema.average_name(var0)
  shadow_var1_name = ema.average_name(var1)
  saver = tf.train.Saver({shadow_var0_name: var0, shadow_var1_name: var1})
  saver.restore(...checkpoint filename...)
  # var0 and var1 now hold the moving average values
  ```