[tf.keras] Reproducing tf.keras models

Building models with Keras is simple and easy to pick up, and Keras is also TensorFlow's high-level API, so it is well worth learning.

Model reproducibility matters a lot in our experiments. After training a model we can save its checkpoint, but if we run the training again we never get the same result.

To make a model implemented in Keras reproducible, first set the seed of every possible source of randomness, e.g. np.random.seed(1). After that there are two cases:

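  1. Do not run the code on the GPU; restrict it to the CPU. In this case you can choose the batch_size argument of fit freely.
  2. Run the code on the GPU, in which case you need to pass batch_size = 1 to fit. (When using tf.keras.layers.Conv2D, results do not seem to be reproducible on the GPU, so the first approach, running on the CPU only, is the better choice.)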
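As a quick check of the seeding step, here is a minimal sketch that assumes only the standard `random` module and NumPy. The helper name `seed_everything` is hypothetical and not part of Keras. It shows that reseeding with the same value replays the same sequence of draws:

```python
import random

import numpy as np


def seed_everything(seed):
    """Seed Python's and NumPy's global generators (hypothetical helper)."""
    random.seed(seed)
    np.random.seed(seed)


def draw_sample():
    """Draw a few numbers from both generators."""
    return [random.random() for _ in range(3)], np.random.rand(3)


seed_everything(1)
py_first, np_first = draw_sample()

seed_everything(1)
py_second, np_second = draw_sample()

# Same seed, same sequence of draws from both generators.
print(py_first == py_second)
print(np.array_equal(np_first, np_second))
```

This only covers the Python and NumPy generators; the TensorFlow seed and session settings still have to be configured separately.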

My tensorflow + keras versions:

print(tf.VERSION)    # '1.10.0'
print(tf.keras.__version__)    # '2.1.6-tf'

Configuration for a reproducible Keras model:

import numpy as np
import tensorflow as tf
import random as rn

import os
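# Run on CPU only. To run the code on GPU, delete the following line.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
# Note: hash randomization is fixed when the interpreter starts, so to affect
# this process PYTHONHASHSEED must be set before Python is launched
# (e.g. in the shell). Setting it here only affects child processes.
os.environ["PYTHONHASHSEED"] = '0'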

# The below is necessary for starting Numpy generated random numbers
# in a well-defined initial state.

np.random.seed(42)

# The below is necessary for starting core Python generated random numbers
# in a well-defined state.

rn.seed(12345)

# Force TensorFlow to use single thread.
# Multiple threads are a potential source of non-reproducible results.
# For further details, see: https://stackoverflow.com/questions/42022950/

session_conf = tf.ConfigProto(intra_op_parallelism_threads=1,
                              inter_op_parallelism_threads=1)

# Use the tf.keras backend so the session below applies to tf.keras models.
from tensorflow.keras import backend as K

# The below tf.set_random_seed() will make random number generation
# in the TensorFlow backend have a well-defined initial state.
# For further details, see:
# https://www.tensorflow.org/api_docs/python/tf/set_random_seed

tf.set_random_seed(1234)

sess = tf.Session(graph=tf.get_default_graph(), config=session_conf)
K.set_session(sess)

# Rest of code follows ...
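One way to confirm that the configuration works is to train twice and compare the resulting weights exactly. The sketch below is an illustration under assumptions: `weights_match` is a hypothetical helper, and the arrays are stand-ins for what `model.get_weights()` would return after each run.

```python
import numpy as np


def weights_match(weights_a, weights_b):
    """Return True if two lists of arrays are identical element by element."""
    if len(weights_a) != len(weights_b):
        return False
    return all(np.array_equal(a, b) for a, b in zip(weights_a, weights_b))


# Stand-ins for the weights of separate training runs; with a real Keras
# model these lists would come from model.get_weights() after each run.
run_1 = [np.array([[0.1, 0.2], [0.3, 0.4]]), np.array([0.0, 1.0])]
run_2 = [w.copy() for w in run_1]
run_3 = [w + 1e-7 for w in run_1]

print(weights_match(run_1, run_2))  # identical runs
print(weights_match(run_1, run_3))  # tiny drift counts as a mismatch
```

Exact comparison is deliberate here: a tolerance-based check such as `np.allclose` would hide the small run-to-run drift that this post is trying to eliminate.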

The same problem occurs with the TensorFlow low-level API, i.e. when you build layers yourself with tf.variable_scope() and tf.get_variable().

The Keras documentation explains it as follows:

Moreover, when using the TensorFlow backend and running on a GPU, some operations have non-deterministic outputs, in particular tf.reduce_sum(). This is due to the fact that GPUs run many operations in parallel, so the order of execution is not always guaranteed. Due to the limited precision of floats, even adding several numbers together may give slightly different results depending on the order in which you add them. You can try to avoid the non-deterministic operations, but some may be created automatically by TensorFlow to compute the gradients, so it is much simpler to just run the code on the CPU.
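The point about addition order can be seen without any GPU. The following plain Python sketch (not taken from the Keras documentation) adds the same three numbers with two different groupings:

```python
# Floating-point addition is not associative: the grouping changes the rounding.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

print(left_to_right)                   # 0.6000000000000001
print(right_to_left)                   # 0.6
print(left_to_right == right_to_left)  # False
```

A parallel reduction on the GPU can group the terms differently from run to run, which is enough to produce small differences that then propagate through training.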

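How does PyTorch ensure reproducibility? cuDNN optimizes convolution operations, trading some precision for computational efficiency. The code below forces cuDNN to produce deterministic results, at the cost of efficiency. For details, see the blog post "PyTorch's reproducibility problem (how to make experimental results reproducible)".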

from torch.backends import cudnn
cudnn.benchmark = False            # benchmark=True lets cuDNN auto-tune algorithms, which can vary between runs
cudnn.deterministic = True

References

How can I obtain reproducible results using Keras during development? -- Keras Documentation
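Can Keras with Tensorflow backend be forced to use CPU or GPU at will? -- Stack Overflow
PyTorch's reproducibility problem (how to make experimental results reproducible) -- hyk_1996.net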
