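I have used Keras quite a bit, and training there is convenient: a single call to model.fit is enough. In PyTorch, however, you have to write the training loop yourself, so here is a short summary of how to train a model in PyTorch.
A standard PyTorch model is written as a class that subclasses torch.nn.Module and follows a fixed structure, shown below.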
import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        """
        In the constructor we instantiate two nn.Linear modules and assign them as
        member variables.
        """
        super(TwoLayerNet, self).__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        """
        In the forward function we accept a Tensor of input data and we must return
        a Tensor of output data. We can use Modules defined in the constructor as
        well as arbitrary operators on Tensors.
        """
        h_relu = self.linear1(x).clamp(min=0)
        y_pred = self.linear2(h_relu)
        return y_pred
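In this structure, the layers of the network must be created in __init__, and the class must provide a forward method that defines the forward pass of the network.
Next, instantiate the model class.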
model = TwoLayerNet(D_in, H, D_out)
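The line above uses D_in, H and D_out without defining them. As a minimal sketch, assuming illustrative sizes (N, D_in, H, D_out = 64, 1000, 100, 10 are my own choice, not values given in this post), the following builds the model, feeds it a random batch, and checks the output shape and the number of learnable parameters:

```python
import torch


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super().__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        h_relu = self.linear1(x).clamp(min=0)  # ReLU written as a clamp
        return self.linear2(h_relu)


# Assumed sizes, for illustration only: batch size, input, hidden, output
N, D_in, H, D_out = 64, 1000, 100, 10

model = TwoLayerNet(D_in, H, D_out)
x = torch.randn(N, D_in)      # a random input batch
y_pred = model(x)             # calling the model runs forward()

# Weights plus biases of both Linear layers
num_params = sum(p.numel() for p in model.parameters())
print(y_pred.shape, num_params)
```

Calling model(x) rather than model.forward(x) is the usual convention, since the call also runs any hooks registered on the module.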
The next step is to choose a loss function and an optimizer.
# Construct our loss function and an Optimizer. The call to model.parameters()
# in the SGD constructor will contain the learnable parameters of the two
# nn.Linear modules which are members of the model.
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)
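The reduction argument controls how the per-element squared errors are combined. A small sketch with made-up tensors (the values are hypothetical and chosen only so the arithmetic is easy to follow) compares 'sum' with 'mean':

```python
import torch

# Hypothetical predictions and targets, 2 samples x 2 outputs
y_pred = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
y = torch.tensor([[0.0, 2.0], [3.0, 6.0]])

# Squared errors per element are 1, 0, 0, 4
loss_sum = torch.nn.MSELoss(reduction='sum')(y_pred, y)    # adds them up
loss_mean = torch.nn.MSELoss(reduction='mean')(y_pred, y)  # averages over all 4 elements

print(loss_sum.item(), loss_mean.item())
```

With 'sum', the loss and its gradients grow with the number of elements in the batch, which is presumably why the learning rate here is as small as 1e-4.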
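With all of this in place, we can start training.
First, decide how many epochs to train for.
In each epoch, first run the forward pass to get the model's predictions, then use the loss function to measure the difference between the predictions and the true values.
Before calling loss.backward(), call optimizer.zero_grad() to clear the existing gradients, because PyTorch accumulates gradients across backward passes by default.
optimizer.step() then performs a single optimization step, updating the parameters with the gradients that were just computed.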
for t in range(500):
    # Forward pass: Compute predicted y by passing x to the model
    y_pred = model(x)

    # Compute and print loss
    loss = criterion(y_pred, y)
    print(t, loss.item())

    # Zero gradients, perform a backward pass, and update the weights.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
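The loop above relies on x and y, which this post never defines. The following is a self-contained sketch of the whole procedure, assuming random input and target tensors and the same illustrative sizes (64, 1000, 100, 10); the seed, the data, and the shorter run of 200 iterations are my own assumptions, not part of the original:

```python
import torch

torch.manual_seed(0)  # assumed seed, only to make the run repeatable


class TwoLayerNet(torch.nn.Module):
    def __init__(self, D_in, H, D_out):
        super().__init__()
        self.linear1 = torch.nn.Linear(D_in, H)
        self.linear2 = torch.nn.Linear(H, D_out)

    def forward(self, x):
        return self.linear2(self.linear1(x).clamp(min=0))


# Assumed sizes and random data, for illustration only
N, D_in, H, D_out = 64, 1000, 100, 10
x = torch.randn(N, D_in)
y = torch.randn(N, D_out)

model = TwoLayerNet(D_in, H, D_out)
criterion = torch.nn.MSELoss(reduction='sum')
optimizer = torch.optim.SGD(model.parameters(), lr=1e-4)

losses = []
for t in range(200):
    y_pred = model(x)            # forward pass
    loss = criterion(y_pred, y)  # compare predictions with targets
    losses.append(loss.item())

    optimizer.zero_grad()        # clear gradients from the previous step
    loss.backward()              # compute fresh gradients
    optimizer.step()             # update the parameters

print("first loss:", losses[0], "last loss:", losses[-1])
```

Because the targets are random noise, the model is only memorizing this one batch, but the falling loss is enough to confirm that the loop is wired up correctly.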
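If the model contains Dropout or Batch Normalization layers, call model.eval() to switch it to evaluation mode before testing or inference. Otherwise Dropout keeps randomly zeroing activations and Batch Normalization keeps using the statistics of the current batch, so the same input can give different results from one run to the next.
model.train() switches the model back to training mode. When training and testing with PyTorch, always make sure the instantiated model is explicitly set to the right mode, train or eval.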
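The effect of the two modes is easy to observe with a Dropout layer. This is a minimal sketch using a small hypothetical network (the layer sizes and the name net are made up for this illustration) that runs the same input twice in each mode:

```python
import torch

torch.manual_seed(0)  # assumed seed, only to make the run repeatable

# Hypothetical network with a Dropout layer
net = torch.nn.Sequential(
    torch.nn.Linear(20, 50),
    torch.nn.ReLU(),
    torch.nn.Dropout(p=0.5),
    torch.nn.Linear(50, 5),
)
x = torch.randn(8, 20)

net.train()                 # training mode: Dropout is active
train_a = net(x)
train_b = net(x)            # a new random mask is drawn on every call

net.eval()                  # evaluation mode: Dropout is a no-op
with torch.no_grad():       # gradients are not needed for inference
    eval_a = net(x)
    eval_b = net(x)

print("train outputs identical:", torch.equal(train_a, train_b))
print("eval outputs identical:", torch.equal(eval_a, eval_b))
```

Note that model.eval() only changes how layers such as Dropout and Batch Normalization behave. It does not disable gradient tracking, which is why the sketch also wraps inference in torch.no_grad().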