A Quick Summary of Training Models in PyTorch

I normally use Keras quite a bit, where training is very convenient: a single call to model.fit is enough. In PyTorch, however, you have to write the training loop yourself, so this post summarizes how to train a model in PyTorch.

PyTorch訓練的大致步驟

A standard PyTorch model is a class that subclasses torch.nn.Module and follows a fixed structure, as shown below:

import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and
        assign them as member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must
        return a Tensor of output data. We can use Modules defined in the
        constructor as well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)  # clamp(min=0) acts as a ReLU
        y_pred = self.linear2(h_relu)
        return y_pred

In this fixed structure, the network layers must be initialized in __init__, and a forward method must be defined to perform the forward pass of the network.
Next, instantiate the model class.
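Note that the dimensions D_in, H, and D_out (input, hidden, and output sizes) have to be defined first. A minimal sketch, assuming illustrative sizes and random training data so the example runs end to end (the values of N, D_in, H, D_out and the tensors x, y below are purely illustrative, not part of the original example):

N, D_in, H, D_out = 64, 1000, 100, 10  # hypothetical batch size and layer sizes

# Random input and target tensors, used here only so the example is runnable
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)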

model = TwoLayerNet(D_in, H, D_out)

The next step is to choose a loss function and an optimizer:

# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
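MSELoss with plain SGD is just one combination; any loss from torch.nn and any optimizer from torch.optim plug into the same two lines. For example, a sketch that swaps in Adam (an alternative shown for illustration, not part of the original example):

# Alternative optimizer (illustrative): Adam adapts per-parameter learning rates
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)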

Once all of this is in place, we can start training.
First, decide how many epochs to train for.
In each epoch, first obtain the model's predictions from the forward pass, then use the loss function to measure the difference between the predictions and the ground truth.
Before running the optimizer, call zero_grad() to clear the gradients; PyTorch accumulates gradients by default, so stale gradients from the previous iteration would otherwise leak into the update.
loss.backward() computes the gradients by backpropagation, and optimizer.step() then performs one optimization step.

for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

If the model contains Dropout or Batch Normalization layers, you must call model.eval() to switch it to eval mode before inference; otherwise Dropout keeps randomly zeroing activations and BatchNorm keeps using per-batch statistics, so the output would differ from one iteration to the next.
model.train() switches the model back to training mode. When training and testing with PyTorch, always make sure the instantiated model is explicitly put into train/eval mode.
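A minimal sketch of that pattern, assuming some held-out tensors x_test and y_test (hypothetical names, not defined in the original example):

model.train()  # training mode: Dropout and BatchNorm behave stochastically
# ... run the training loop shown above ...

model.eval()  # eval mode: Dropout is disabled, BatchNorm uses running statistics
with torch.no_grad():  # also skip gradient tracking during evaluation
    # x_test / y_test are hypothetical held-out data
    test_pred = model(x_test)
    test_loss = criterion(test_pred, y_test)
    print('test loss:', test_loss.item())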