Epoch vs Iteration when training neural networks

Date: 2020-03-04

Tags: neural network training, epoch, iteration


What is the difference between an epoch and an iteration when training a multi-layer perceptron?

Answer #1

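Typically, you split your training set into small batches for the network to learn from, and training proceeds one batch at a time, with each step passing through the network's layers and applying gradient descent. Each of these small steps can be called an iteration.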

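An epoch corresponds to the entire training set passing through the whole network once. Limiting the number of epochs can be useful, for example to combat overfitting.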
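One common way to limit the number of epochs is early stopping: training ends once a held-out validation loss has stopped improving for a few epochs. The sketch below is only an illustration and is not part of the answer; the function name `stopping_epoch`, the `patience` parameter, and the loss values are all hypothetical.

```python
def stopping_epoch(val_losses, patience=2):
    """Return the index of the epoch at which training would stop.

    val_losses: validation loss measured after each epoch (made-up data here).
    patience: how many consecutive epochs without improvement to tolerate.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses) - 1  # the rule never triggered: all epochs were run


# Hypothetical validation losses: they improve, then start rising again.
losses = [0.90, 0.70, 0.60, 0.65, 0.70, 0.80]
print(stopping_epoch(losses))
```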

Answer #2

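I believe an iteration is equivalent to a single batch forward pass plus backpropagation in batch SGD. An epoch is going through the entire dataset once (as someone else mentioned).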
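To make "a single batch forward pass plus backpropagation" concrete, here is a minimal sketch on a toy model. The model (`y_hat = w * x` with squared error), the function name `sgd_iteration`, the data, and the learning rate are all assumptions chosen for illustration; a real network would compute gradients for many weights.

```python
def sgd_iteration(w, batch, lr):
    """One iteration: forward pass, backward pass, and update on one batch."""
    # Forward pass: predictions for every example in the batch.
    preds = [w * x for x, _ in batch]
    # Backward pass: gradient of the mean squared error with respect to w.
    grad = sum(2 * (p - y) * x for p, (x, y) in zip(preds, batch)) / len(batch)
    # Update step: move w against the gradient.
    return w - lr * grad


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # generated from y = 2x
batch_size = 2
w = 0.0
for epoch in range(50):                        # one epoch = one pass over data
    for k in range(0, len(data), batch_size):  # one iteration per batch
        w = sgd_iteration(w, data[k:k + batch_size], lr=0.05)
print(w)  # should end up very close to 2
```

With 4 examples and a batch size of 2, each epoch here consists of 2 iterations, so the loop performs 100 iterations in total.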

Answer #3

In neural network terminology:

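one epoch = one forward pass and one backward pass of all the training examples
batch size = the number of training examples in one forward/backward pass. The larger the batch size, the more memory you need.
number of iterations = the number of passes, each pass using [batch size] examples. To be clear, one pass = one forward pass + one backward pass (the forward pass and the backward pass are not counted as two separate passes).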

Example: if you have 1000 training examples and your batch size is 500, it takes 2 iterations to complete 1 epoch.
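The arithmetic in this example can be sketched as a small helper. The function name `iterations_per_epoch` is hypothetical, and the sketch assumes that a final, smaller batch is still processed when the batch size does not divide the number of examples evenly (hence the ceiling).

```python
import math


def iterations_per_epoch(num_examples, batch_size):
    """Number of iterations (batches) needed to see every example once."""
    # Assumption: a final partial batch still counts as one iteration.
    return math.ceil(num_examples / batch_size)


print(iterations_per_epoch(1000, 500))       # 2, the case from the example
print(iterations_per_epoch(1000, 300))       # 4: three full batches plus one of 100
print(10 * iterations_per_epoch(1000, 500))  # 20 iterations over 10 epochs
```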

FYI: Tradeoff batch size vs. number of iterations to train a neural network

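The term "batch" is ambiguous: some people use it to mean the entire training set, and others use it to mean the number of training examples in one forward/backward pass (as I did in this answer). To avoid this ambiguity and make clear that a batch corresponds to the number of training examples in one forward/backward pass, you can use the term mini-batch.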

Answer #4

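An epoch is one iteration over the subset of samples used for training, for example in the gradient descent algorithm of a neural network. A good reference is: http://neuralnetworksanddeeplearning.com/chap1.html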

Note that the page has code for the gradient descent algorithm, which uses epochs:

import random  # needed for random.shuffle; belongs at module level

# Method of the Network class on the referenced page (hence "self"),
# shown here with Python 3 syntax (range, print as a function).
def SGD(self, training_data, epochs, mini_batch_size, eta,
        test_data=None):
    """Train the neural network using mini-batch stochastic
    gradient descent.  The "training_data" is a list of tuples
    "(x, y)" representing the training inputs and the desired
    outputs.  The other non-optional parameters are
    self-explanatory.  If "test_data" is provided then the
    network will be evaluated against the test data after each
    epoch, and partial progress printed out.  This is useful for
    tracking progress, but slows things down substantially."""
    if test_data:
        n_test = len(test_data)
    n = len(training_data)
    for j in range(epochs):  # each pass of this outer loop is one epoch
        random.shuffle(training_data)
        mini_batches = [
            training_data[k:k+mini_batch_size]
            for k in range(0, n, mini_batch_size)]
        for mini_batch in mini_batches:  # each pass here is one iteration
            self.update_mini_batch(mini_batch, eta)
        if test_data:
            print("Epoch {0}: {1} / {2}".format(
                j, self.evaluate(test_data), n_test))
        else:
            print("Epoch {0} complete".format(j))

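Look at the code. For each epoch, we randomly generate the subsets of inputs (the mini-batches) for the gradient descent algorithm. Why epochs are effective is also explained on that page. Please take a look.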
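The shuffle-and-split step can also be tried on its own, without a network. The following is a standalone sketch under stated assumptions: the generator name `minibatches_per_epoch` and its `seed` parameter are hypothetical, and plain integers stand in for the "(x, y)" training tuples.

```python
import random


def minibatches_per_epoch(training_data, epochs, mini_batch_size, seed=0):
    """Yield (epoch, iteration, mini_batch) in the order the SGD loop visits them."""
    rng = random.Random(seed)   # seeded only to make the sketch repeatable
    data = list(training_data)  # copy, so the caller's data is not shuffled
    n = len(data)
    for epoch in range(epochs):
        rng.shuffle(data)       # reshuffle at the start of every epoch
        for iteration, k in enumerate(range(0, n, mini_batch_size)):
            yield epoch, iteration, data[k:k + mini_batch_size]


seen = {}
for epoch, iteration, batch in minibatches_per_epoch(
        range(10), epochs=2, mini_batch_size=4):
    seen.setdefault(epoch, []).extend(batch)
    print(epoch, iteration, len(batch))

# Within each epoch, every example is visited exactly once.
print(all(sorted(examples) == list(range(10)) for examples in seen.values()))
```

With 10 examples and a mini-batch size of 4, each epoch has 3 iterations: two mini-batches of 4 and a final one of 2.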