When building and training deep learning models with PyTorch, it is often useful to see the process visually, for example by plotting training curves.
For simple curve plotting, the matplotlib library is enough to produce basic figures. For more advanced visualization, PyTorch works with several tools, such as tensorwatch, visdom, and tensorboard; in practice, tensorboard turns out to be the most convenient, mature, and stable of them.
(Since version 1.2, PyTorch officially supports tensorboard natively, so installing tensorboardX is no longer necessary.)
This article uses the training of a simple linear model to show how to use both visualization methods.
Install dependencies
This article uses the PyTorch 1.5 CPU build; the matplotlib and tensorboard libraries need to be installed separately:
pip install matplotlib
pip install tensorboard
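A quick sanity check of the environment (a minimal sketch; it only touches APIs that ship with PyTorch 1.2 and later):

```python
# Verify that torch and its built-in tensorboard integration are importable.
import torch

print(torch.__version__)  # e.g. "1.5.0" for the CPU build used in this article

# This import raises ImportError if the separate tensorboard package is missing.
from torch.utils.tensorboard import SummaryWriter

print("tensorboard integration OK")
```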
Code
Import modules
import torch
from torch import nn
import torch.utils.data as Data
from torch.nn import init
import numpy as np
import matplotlib.pyplot as plt
Prepare the data
A simple linear model serves as the example:
# real parameters
true_w = [2, -3.4]
true_b = 4.2
The model takes two input features and produces one output value.
Construct a batch of feature vectors from random numbers and add random noise to the labels:
# prepare data
input_num = 2    # input feature dim
output_num = 1   # output result dim
num_samples = 1000
features = torch.rand(num_samples, input_num)
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += torch.tensor(np.random.normal(0, 0.01, size=labels.size()))  # add random noise

batch_size = 10
num_trains = 800
num_tests = 200
train_features = features[:num_trains, :]
train_labels = labels[:num_trains]
test_features = features[num_trains:, :]
test_labels = labels[num_trains:]
Define the model
Build a single linear layer matching the input and output dimensions:
# define model
class LinearNet(nn.Module):
    def __init__(self, n_feature, n_output):
        super(LinearNet, self).__init__()
        self.linear = nn.Linear(n_feature, n_output)

    def forward(self, x):
        y = self.linear(x)
        return y

net = LinearNet(input_num, output_num)
Define the training parameters
lr = 0.03
loss = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=lr)
num_epochs = 100
Visualization with matplotlib
Save intermediate results during training, then plot the loss curves with matplotlib:
# define draw
def plotCurve(x_vals, y_vals, x_label, y_label,
              x2_vals=None, y2_vals=None, legend=None,
              figsize=(3.5, 2.5)):
    # set figsize
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    plt.semilogy(x_vals, y_vals)
    if x2_vals and y2_vals:
        plt.semilogy(x2_vals, y2_vals, linestyle=':')
    if legend:
        plt.legend(legend)
# train and visualize
def train1(net, num_epochs, batch_size, train_features, train_labels,
           test_features, test_labels, loss, optimizer):
    print("=== train begin ===")
    # data process
    train_dataset = Data.TensorDataset(train_features, train_labels)
    test_dataset = Data.TensorDataset(test_features, test_labels)
    train_iter = Data.DataLoader(train_dataset, batch_size, shuffle=True)
    test_iter = Data.DataLoader(test_dataset, batch_size, shuffle=True)

    # train by step
    train_ls, test_ls = [], []
    for epoch in range(num_epochs):
        for x, y in train_iter:
            ls = loss(net(x).view(-1, 1), y.view(-1, 1))
            optimizer.zero_grad()
            ls.backward()
            optimizer.step()

        # save loss for each epoch
        train_ls.append(loss(net(train_features).view(-1, 1), train_labels.view(-1, 1)).item())
        test_ls.append(loss(net(test_features).view(-1, 1), test_labels.view(-1, 1)).item())
        if epoch % 10 == 0:
            print("epoch %d: train loss %f, test loss %f" % (epoch, train_ls[-1], test_ls[-1]))

    print("final epoch: train loss %f, test loss %f" % (train_ls[-1], test_ls[-1]))
    print("plot curves")
    plotCurve(range(1, num_epochs + 1), train_ls, "epoch", "loss",
              range(1, num_epochs + 1), test_ls, ["train", "test"])
    print("=== train end ===")

train1(net, num_epochs, batch_size, train_features, train_labels,
       test_features, test_labels, loss, optimizer)
The output:
=== train begin ===
epoch 0: train loss 1.163743, test loss 1.123318
epoch 10: train loss 0.002227, test loss 0.001833
epoch 20: train loss 0.000107, test loss 0.000106
epoch 30: train loss 0.000101, test loss 0.000106
epoch 40: train loss 0.000100, test loss 0.000104
epoch 50: train loss 0.000101, test loss 0.000104
epoch 60: train loss 0.000100, test loss 0.000103
epoch 70: train loss 0.000101, test loss 0.000102
epoch 80: train loss 0.000100, test loss 0.000103
epoch 90: train loss 0.000103, test loss 0.000103
final epoch: train loss 0.000100, test loss 0.000102
plot curves
=== train end ===
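Note that plotCurve only draws onto the current figure; in a notebook the figure renders inline, but in a plain script it must also be shown or saved explicitly. A minimal sketch (the Agg backend and the output filename are choices made here for illustration):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt

# Draw a dummy loss curve the same way plotCurve does, then save it to a file.
epochs = range(1, 11)
losses = [1.0 / (e * e) for e in epochs]
plt.xlabel("epoch")
plt.ylabel("loss")
plt.semilogy(epochs, losses)

out_path = os.path.join(tempfile.mkdtemp(), "loss.png")
plt.savefig(out_path)  # use plt.show() instead in an interactive session
print(out_path)
```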
Check the learned parameters:
print(net.linear.weight)
print(net.linear.bias)
Parameter containing:
tensor([[ 1.9967, -3.3978]], requires_grad=True)
Parameter containing:
tensor([4.2014], requires_grad=True)
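As a cross-check, the whole pipeline can be condensed into a self-contained sketch that recovers the true parameters. The full-batch loop, learning rate, and epoch count below are illustrative choices, not the article's exact setup:

```python
import torch
from torch import nn

torch.manual_seed(0)
true_w = [2, -3.4]
true_b = 4.2

# Same data-generating process as above, with gaussian noise on the labels.
features = torch.rand(1000, 2)
labels = true_w[0] * features[:, 0] + true_w[1] * features[:, 1] + true_b
labels += 0.01 * torch.randn(labels.size())

net = nn.Linear(2, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(net.parameters(), lr=0.5)

# Full-batch gradient descent instead of minibatches, for brevity.
for epoch in range(1000):
    ls = loss_fn(net(features).view(-1), labels)
    optimizer.zero_grad()
    ls.backward()
    optimizer.step()

print(net.weight.data)  # close to [2, -3.4]
print(net.bias.data)    # close to [4.2]
```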
Visualization with tensorboard
Start the tensorboard web page
First import the module:
# train with tensorboard
from torch.utils.tensorboard import SummaryWriter
Define a writer; the intermediate data it records lands in the given directory, from which tensorboard serves the display:
writer = SummaryWriter("runs/linear_experiment")  # defaults to the runs folder if no path is specified
Now start the tensorboard web service from the command line:
tensorboard --logdir=runs
Then open http://localhost:6006/ to reach the tensorboard main page.
Model visualization
Given a sample input, tensorboard can write the model graph to the log and display the model structure on the web page:
sample = train_features[0]
writer.add_graph(net, sample)
Refresh the page to see the graph.
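If nothing shows up after refreshing, the events may still be buffered in memory; SummaryWriter.flush() forces them to disk. A small sketch (the throwaway log directory is just for illustration; the article writes to runs/linear_experiment):

```python
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

logdir = tempfile.mkdtemp()  # illustrative path
writer = SummaryWriter(logdir)
writer.add_scalar("demo", 1.0, 0)
writer.flush()  # push buffered events to disk so the web UI can read them

# An event file should now exist in the log directory.
print(os.listdir(logdir))
```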
Visualizing the training process
Define a new training function that writes the corresponding scalar data during training, so that tensorboard can plot the curves:
# train and record scalars
def train2(net, num_epochs, batch_size, train_features, train_labels,
           test_features, test_labels, loss, optimizer):
    print("=== train begin ===")
    # data process
    train_dataset = Data.TensorDataset(train_features, train_labels)
    test_dataset = Data.TensorDataset(test_features, test_labels)
    train_iter = Data.DataLoader(train_dataset, batch_size, shuffle=True)
    test_iter = Data.DataLoader(test_dataset, batch_size, shuffle=True)

    # train by step
    for epoch in range(num_epochs):
        for x, y in train_iter:
            ls = loss(net(x).view(-1, 1), y.view(-1, 1))
            optimizer.zero_grad()
            ls.backward()
            optimizer.step()

        # compute loss for each epoch
        train_loss = loss(net(train_features).view(-1, 1), train_labels.view(-1, 1)).item()
        test_loss = loss(net(test_features).view(-1, 1), test_labels.view(-1, 1)).item()

        # write to tensorboard
        writer.add_scalar("train_loss", train_loss, epoch)
        writer.add_scalar("test_loss", test_loss, epoch)
        if epoch % 10 == 0:
            print("epoch %d: train loss %f, test loss %f" % (epoch, train_loss, test_loss))

    print("final epoch: train loss %f, test loss %f" % (train_loss, test_loss))
    print("=== train end ===")

train2(net, num_epochs, batch_size, train_features, train_labels,
       test_features, test_labels, loss, optimizer)
The output:
=== train begin ===
epoch 0: train loss 0.928869, test loss 0.978139
epoch 10: train loss 0.000948, test loss 0.000902
epoch 20: train loss 0.000102, test loss 0.000104
epoch 30: train loss 0.000101, test loss 0.000105
epoch 40: train loss 0.000101, test loss 0.000102
epoch 50: train loss 0.000100, test loss 0.000103
epoch 60: train loss 0.000101, test loss 0.000105
epoch 70: train loss 0.000102, test loss 0.000103
epoch 80: train loss 0.000104, test loss 0.000110
epoch 90: train loss 0.000100, test loss 0.000103
final epoch: train loss 0.000101, test loss 0.000104
=== train end ===
Refresh the page to see the curves.
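With add_scalar, train_loss and test_loss land in two separate charts. To overlay them in one chart, SummaryWriter also provides add_scalars, which groups several curves under a single main tag. A minimal sketch with made-up loss values (in train2 these would be the real ones):

```python
import os
import tempfile

from torch.utils.tensorboard import SummaryWriter

logdir = tempfile.mkdtemp()  # illustrative path
writer = SummaryWriter(logdir)

# Fake losses purely for illustration.
for epoch, (train_loss, test_loss) in enumerate([(1.0, 1.1), (0.5, 0.6), (0.1, 0.2)]):
    writer.add_scalars("loss", {"train": train_loss, "test": test_loss}, epoch)

writer.close()

# add_scalars creates one sub-run directory per curve under the log directory.
print(sorted(os.listdir(logdir)))
```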
More advanced usage is left for the reader to explore.
