(Original) DropBlock: A regularization method for convolutional networks

Please credit the source when reposting:

http://www.javashuo.com/article/p-mvzxjzwc-e.html

Paper:

https://arxiv.org/abs/1810.12890

Third-party implementations:

PyTorch: https://github.com/Randl/DropBlock-pytorch

TensorFlow: https://github.com/DHZS/tf-dropblock

Modified PyTorch implementation: https://github.com/darkknightzh/DropBlock_pytorch

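Overview

Conventional dropout works well on fully connected (FC) layers but less well on convolutional layers. A likely reason is that the feature units of a conv layer are spatially correlated: when dropout discards individual units at random, the network can still recover the same information from neighbouring units, so the input information still reaches the next layer and the network can still overfit. DropBlock, proposed in the paper, instead drops contiguous regions of the feature map together.

The figure below gives a simple example. (a) is the input image, in which regions such as the dog's head and feet are correlated. (b) drops information the dropout way, so related information can still be obtained from nearby regions (the cells marked x form the drop mask). (c) is the DropBlock way, in which all features within a region are dropped.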

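Algorithm

Because a mask is involved, DropBlock can be implemented in two ways: ① all feature channels share one DropBlock mask, so information is dropped at the same positions in every channel; ② each feature channel uses its own DropBlock mask. The paper finds that the second works better. The figure below shows the procedure for option ②. DropBlock has two parameters, block_size and γ: block_size is the side length of a block, and γ is the parameter of the Bernoulli distribution.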

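Notes (the numbers are the step numbers in the algorithm above):

3 In inference mode, no features are dropped.

5 Generate the mask M, with every point drawn from a Bernoulli distribution with parameter γ (a Bernoulli random variable takes the value 1 with probability γ and 0 with probability 1-γ). Note that only the points inside the green valid region of the mask are sampled this way, as shown in panel a of the figure below, which ensures that step 6 never has to handle the border.

6 For every point in M that is 0, create a square of side block_size centred on that point and set all values inside it to 0, as shown in panel b of the figure above. Note that in the implementations later in this post it is the points sampled as 1, which occur with probability γ, that serve as block centres.

7 Apply the mask to the feature map: A=A*M

8 Normalize the features: A=A*count(M)/count_ones(M), where count is the number of elements of M (the width times the height of the feature map) and count_ones is the number of elements of M equal to 1. The factor is at least 1, so it scales the surviving features up slightly to compensate for the dropped ones, much as ordinary dropout rescales by 1/keep_prob.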
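Steps 5 to 8 can be sketched for a single 2-D feature map in plain Python. This is only an illustrative sketch under stated assumptions, not the paper's or any library's implementation: the function name dropblock_2d, the list-of-rows representation and the rng argument are made up for the example, and a sampled hit (probability gamma) is treated as a block centre.

```python
import random


def dropblock_2d(feature, block_size, gamma, rng=None):
    """Apply steps 5-8 to one 2-D feature map given as a list of rows."""
    rng = rng or random.Random()
    h, w = len(feature), len(feature[0])
    half = block_size // 2
    tail = block_size - 1 - half  # extent of a block below/right of its centre
    mask = [[1] * w for _ in range(h)]
    # Step 5: sample block centres only inside the valid region,
    # so that every block fits entirely inside the feature map.
    for i in range(half, h - tail):
        for j in range(half, w - tail):
            if rng.random() < gamma:
                # Step 6: zero a block_size x block_size square around the centre.
                for di in range(-half, tail + 1):
                    for dj in range(-half, tail + 1):
                        mask[i + di][j + dj] = 0
    kept = sum(map(sum, mask))
    if kept == 0:
        # Everything was dropped; avoid dividing by zero.
        return [[0.0] * w for _ in range(h)], mask
    # Steps 7 and 8: apply the mask and rescale by count(M) / count_ones(M).
    scale = h * w / kept
    out = [[feature[i][j] * mask[i][j] * scale for j in range(w)] for i in range(h)]
    return out, mask


ones = [[1.0] * 8 for _ in range(8)]
out, mask = dropblock_2d(ones, block_size=3, gamma=0.1, rng=random.Random(0))
```

Restricting the centres to the valid region means no block is ever clipped by the border, which is the point made in step 5. For an all-ones input, the rescaling keeps the sum of the output equal to the sum of the input whenever at least one element survives.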

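Details

block_size: the same block_size is used for all feature maps; in the paper a value of 7 works best in most cases.

γ: γ is not set directly but derived from keep_prob. The two are related as follows (feat_size is the size of the feature map):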

$\gamma = \frac{1 - keep\_prob}{block\_size^{2}} \cdot \frac{feat\_size^{2}}{(feat\_size - block\_size + 1)^{2}}$

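keep_prob: a fixed keep_prob does not work well, and a keep_prob that is too small early in training hurts the learned parameters. DropBlock therefore decreases keep_prob linearly from 1 to the target value (e.g. 0.9) as training proceeds.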
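The γ formula and the linear keep_prob schedule can be combined in a few lines. The sketch below is an illustration only: the function names, the step-based schedule and the example sizes (a 28x28 feature map, 1000 steps) are assumptions and do not come from the paper.

```python
def dropblock_gamma(keep_prob, block_size, feat_size):
    """gamma derived from keep_prob, following the formula above."""
    valid = feat_size - block_size + 1  # side length of the valid region
    return (1.0 - keep_prob) / block_size ** 2 * feat_size ** 2 / valid ** 2


def scheduled_keep_prob(step, total_steps, target_keep_prob=0.9):
    """Linearly decrease keep_prob from 1 to target_keep_prob over total_steps."""
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 1.0 - (1.0 - target_keep_prob) * progress


# Example values assumed for illustration: 28x28 feature map, block_size 7.
for step in (0, 500, 1000):
    keep_prob = scheduled_keep_prob(step, total_steps=1000)
    gamma = dropblock_gamma(keep_prob, block_size=7, feat_size=28)
    print(f"step={step} keep_prob={keep_prob:.3f} gamma={gamma:.6f}")
```

With block_size=1 the formula reduces to gamma = 1 - keep_prob, i.e. ordinary dropout. In a training loop, the scheduled value could be assigned to the keep_prob attribute of the DropBlock module before each forward pass.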

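Experimental results

See the paper for details.

One point worth noting: for the ResNet-50 experiments, the number of training epochs was increased from 90 to 270, and the learning rate was reduced by a factor of 10 at each of epochs 125, 200 and 250. Because training runs for so long, the baseline is not taken from the final result (with that many epochs the model may overfit and its validation performance may drop). Instead, validation performance is evaluated throughout training and the best value is used as the baseline.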
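The step schedule and the "best validation result" rule described above can be written down as follows. This is a sketch under assumptions: the function names, the base_lr argument and the 0-based epoch counting are choices made for the example, and the accuracy values in the usage lines are made-up placeholders, not results from the paper. In PyTorch, torch.optim.lr_scheduler.MultiStepLR with milestones=[125, 200, 250] and gamma=0.1 expresses the same kind of schedule.

```python
def learning_rate(epoch, base_lr, milestones=(125, 200, 250), factor=0.1):
    """Learning rate at a given epoch: multiplied by factor at each milestone."""
    drops = sum(1 for m in milestones if epoch >= m)
    return base_lr * factor ** drops


def best_validation(history):
    """history: list of (epoch, validation_accuracy); returns the best pair."""
    return max(history, key=lambda item: item[1])


# Placeholder numbers for illustration only (not taken from the paper).
history = [(1, 0.50), (2, 0.60), (3, 0.58)]
best_epoch, best_acc = best_validation(history)
```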

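PyTorch code and modifications

1. Obtaining the mask with conv2d

The reference repository provides PyTorch code; its drawback is that keep_prob has to be decreased manually. The code is as follows: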

import torch
import torch.nn as nn
import torch.nn.functional as F


class DropBlock2D(nn.Module):
    r"""Randomly zeroes spatial blocks of the input tensor.

    As described in the paper
    `DropBlock: A regularization method for convolutional networks`_ ,
    dropping whole blocks of feature map allows to remove semantic
    information as compared to regular dropout.

    Args:
        keep_prob (float, optional): probability of an element to be kept.
            Authors recommend to linearly decrease this value from 1 to desired
            value.
        block_size (int, optional): size of the block. Block size in paper
            usually equals last feature map dimensions.

    Shape:
        - Input: :math:`(N, C, H, W)`
        - Output: :math:`(N, C, H, W)` (same shape as input)

    .. _DropBlock: A regularization method for convolutional networks:
       https://arxiv.org/abs/1810.12890
    """

    def __init__(self, keep_prob=0.9, block_size=7):
        super(DropBlock2D, self).__init__()
        self.keep_prob = keep_prob
        self.block_size = block_size

    def forward(self, input):
        # No-op at inference time or when nothing is to be dropped.
        if not self.training or self.keep_prob == 1:
            return input
        # gamma: Bernoulli rate of the block centres, derived from keep_prob.
        gamma = (1. - self.keep_prob) / self.block_size ** 2
        for sh in input.shape[2:]:
            gamma *= sh / (sh - self.block_size + 1)
        # M has the same shape as the input; a 1 marks the centre of a block to drop.
        M = torch.bernoulli(torch.ones_like(input) * gamma)
        # A depthwise convolution with an all-ones kernel counts the centres inside
        # each block_size x block_size window. With padding=block_size // 2 the
        # output only matches the input size when block_size is odd.
        kernel = torch.ones((input.shape[1], 1, self.block_size, self.block_size),
                            device=input.device, dtype=input.dtype)
        Msum = F.conv2d(M, kernel, padding=self.block_size // 2, groups=input.shape[1])
        # Keep a position only if no centre falls inside its window.
        mask = (Msum < 1).to(device=input.device, dtype=input.dtype)
        # Rescale by count(M) / count_ones(M).
        return input * mask * mask.numel() / mask.sum()  # TODO input * mask * self.keep_prob ?

The code uses conv2d to obtain the mask.

The three statements at the start of forward that assign and update gamma compute γ.

The torch.bernoulli call produces M, a Bernoulli-distributed matrix with the same shape as the input.

The F.conv2d call followed by the Msum < 1 comparison produces mask.

2. Obtaining the mask with max_pool2d

Following the TensorFlow code, the mask can also be obtained with max_pool2d, as in the code below:

import torch
import torch.nn as nn
import torch.nn.functional as F


class DropBlock2DV2(nn.Module):
    def __init__(self, keep_prob=0.9, block_size=7):
        super(DropBlock2DV2, self).__init__()
        self.keep_prob = keep_prob
        self.block_size = block_size

    def forward(self, input):
        # No-op at inference time or when nothing is to be dropped.
        if not self.training or self.keep_prob == 1:
            return input
        # gamma: Bernoulli rate of the block centres, derived from keep_prob.
        gamma = (1. - self.keep_prob) / self.block_size ** 2
        for sh in input.shape[2:]:
            gamma *= sh / (sh - self.block_size + 1)
        M = torch.bernoulli(torch.ones_like(input) * gamma)

        # Stride-1 max pooling spreads every sampled centre over a
        # block_size x block_size window, so Msum is 1 wherever a block is dropped.
        Msum = F.max_pool2d(M, kernel_size=[self.block_size, self.block_size], stride=1, padding=self.block_size // 2)

        mask = (1 - Msum).to(device=input.device, dtype=input.dtype)

        # Rescale by count(M) / count_ones(M).
        return input * mask * mask.numel() / mask.sum()

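3. Comparison of results

The figure below shows the Bernoulli sample M obtained by applying DropBlock2D directly to MNIST images (both versions give the same result).

The figure below shows the resulting Msum (both versions give the same result).

The results above are identical.

When DropBlock2D is applied to an intermediate layer, the first group is M, the second is Msum and the third is mask:

When DropBlock2DV2 is applied to an intermediate layer, the first group is M, the second is Msum and the third is mask:

As can be seen, although Msum differs between the two versions, the resulting mask is the same.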
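The agreement between the two masks can also be checked numerically. The sketch below is a minimal check written for this post, not code from either repository: the helper name masks_from_centres, the tensor sizes and the sampling rate 0.05 are assumptions. It feeds the same sampled centres through both routes and compares the masks.

```python
import torch
import torch.nn.functional as F


def masks_from_centres(centres, block_size):
    """Build the drop mask from the same centres via conv2d and via max_pool2d."""
    channels = centres.shape[1]
    pad = block_size // 2
    # Route 1: depthwise all-ones convolution counts centres per window.
    kernel = torch.ones(channels, 1, block_size, block_size, dtype=centres.dtype)
    counts = F.conv2d(centres, kernel, padding=pad, groups=channels)
    mask_conv = (counts < 1).to(centres.dtype)
    # Route 2: stride-1 max pooling flags windows containing at least one centre.
    hit = F.max_pool2d(centres, kernel_size=block_size, stride=1, padding=pad)
    mask_pool = 1 - hit
    return counts, hit, mask_conv, mask_pool


torch.manual_seed(0)
centres = torch.bernoulli(torch.full((2, 3, 12, 12), 0.05))
counts, hit, mask_conv, mask_pool = masks_from_centres(centres, block_size=3)
print(torch.equal(mask_conv, mask_pool))
```

The convolution yields a count (0, 1, 2, ...) where blocks overlap, while max pooling yields only 0 or 1, which is why Msum can differ. Both masks keep exactly the positions whose window contains no centre, so they coincide.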

Training log with DropBlock2D:

Train Epoch: 1 [0/60000 (0%)]	Loss: 2.361322
Train Epoch: 1 [2560/60000 (4%)]	Loss: 2.341034
Train Epoch: 1 [5120/60000 (9%)]	Loss: 2.287267
Train Epoch: 1 [7680/60000 (13%)]	Loss: 2.274684
Train Epoch: 1 [10240/60000 (17%)]	Loss: 2.260440
Train Epoch: 1 [12800/60000 (21%)]	Loss: 2.259492
Train Epoch: 1 [15360/60000 (26%)]	Loss: 2.240603
Train Epoch: 1 [17920/60000 (30%)]	Loss: 2.207781
Train Epoch: 1 [20480/60000 (34%)]	Loss: 2.177025
Train Epoch: 1 [23040/60000 (38%)]	Loss: 2.137965
Train Epoch: 1 [25600/60000 (43%)]	Loss: 2.029636
Train Epoch: 1 [28160/60000 (47%)]	Loss: 1.967242
Train Epoch: 1 [30720/60000 (51%)]	Loss: 1.948034
Train Epoch: 1 [33280/60000 (55%)]	Loss: 1.856952
Train Epoch: 1 [35840/60000 (60%)]	Loss: 1.786148
Train Epoch: 1 [38400/60000 (64%)]	Loss: 1.657498
Train Epoch: 1 [40960/60000 (68%)]	Loss: 1.604034
Train Epoch: 1 [43520/60000 (72%)]	Loss: 1.550600
Train Epoch: 1 [46080/60000 (77%)]	Loss: 1.425241
Train Epoch: 1 [48640/60000 (81%)]	Loss: 1.455231
Train Epoch: 1 [51200/60000 (85%)]	Loss: 1.263960
Train Epoch: 1 [53760/60000 (89%)]	Loss: 1.255704
Train Epoch: 1 [56320/60000 (94%)]	Loss: 1.156345
Train Epoch: 1 [58880/60000 (98%)]	Loss: 1.272000

Test set: Average loss: 0.7381, Accuracy: 8467/10000 (85%)

Train Epoch: 2 [0/60000 (0%)]	Loss: 1.198182
Train Epoch: 2 [2560/60000 (4%)]	Loss: 1.133391
Train Epoch: 2 [5120/60000 (9%)]	Loss: 0.958502
Train Epoch: 2 [7680/60000 (13%)]	Loss: 1.068815
Train Epoch: 2 [10240/60000 (17%)]	Loss: 1.103716
Train Epoch: 2 [12800/60000 (21%)]	Loss: 0.949479
Train Epoch: 2 [15360/60000 (26%)]	Loss: 0.931306
Train Epoch: 2 [17920/60000 (30%)]	Loss: 0.922673
Train Epoch: 2 [20480/60000 (34%)]	Loss: 0.954674
Train Epoch: 2 [23040/60000 (38%)]	Loss: 0.875961
Train Epoch: 2 [25600/60000 (43%)]	Loss: 0.892115
Train Epoch: 2 [28160/60000 (47%)]	Loss: 0.835847
Train Epoch: 2 [30720/60000 (51%)]	Loss: 0.847458
Train Epoch: 2 [33280/60000 (55%)]	Loss: 0.791372
Train Epoch: 2 [35840/60000 (60%)]	Loss: 0.856625
Train Epoch: 2 [38400/60000 (64%)]	Loss: 0.717881
Train Epoch: 2 [40960/60000 (68%)]	Loss: 0.752228
Train Epoch: 2 [43520/60000 (72%)]	Loss: 0.803537
Train Epoch: 2 [46080/60000 (77%)]	Loss: 0.756578
Train Epoch: 2 [48640/60000 (81%)]	Loss: 0.751846
Train Epoch: 2 [51200/60000 (85%)]	Loss: 0.717995
Train Epoch: 2 [53760/60000 (89%)]	Loss: 0.780597
Train Epoch: 2 [56320/60000 (94%)]	Loss: 0.731817
Train Epoch: 2 [58880/60000 (98%)]	Loss: 0.689091

Test set: Average loss: 0.3513, Accuracy: 9160/10000 (92%)

Train Epoch: 3 [0/60000 (0%)]	Loss: 0.637277
Train Epoch: 3 [2560/60000 (4%)]	Loss: 0.669868
Train Epoch: 3 [5120/60000 (9%)]	Loss: 0.628944
Train Epoch: 3 [7680/60000 (13%)]	Loss: 0.577496
Train Epoch: 3 [10240/60000 (17%)]	Loss: 0.592267
Train Epoch: 3 [12800/60000 (21%)]	Loss: 0.653577
Train Epoch: 3 [15360/60000 (26%)]	Loss: 0.665416
Train Epoch: 3 [17920/60000 (30%)]	Loss: 0.586191
Train Epoch: 3 [20480/60000 (34%)]	Loss: 0.608021
Train Epoch: 3 [23040/60000 (38%)]	Loss: 0.687553
Train Epoch: 3 [25600/60000 (43%)]	Loss: 0.543245
Train Epoch: 3 [28160/60000 (47%)]	Loss: 0.709381
Train Epoch: 3 [30720/60000 (51%)]	Loss: 0.593673
Train Epoch: 3 [33280/60000 (55%)]	Loss: 0.578504
Train Epoch: 3 [35840/60000 (60%)]	Loss: 0.549813
Train Epoch: 3 [38400/60000 (64%)]	Loss: 0.525090
Train Epoch: 3 [40960/60000 (68%)]	Loss: 0.660547
Train Epoch: 3 [43520/60000 (72%)]	Loss: 0.453336
Train Epoch: 3 [46080/60000 (77%)]	Loss: 0.564345
Train Epoch: 3 [48640/60000 (81%)]	Loss: 0.553669
Train Epoch: 3 [51200/60000 (85%)]	Loss: 0.553587
Train Epoch: 3 [53760/60000 (89%)]	Loss: 0.606980
Train Epoch: 3 [56320/60000 (94%)]	Loss: 0.595271
Train Epoch: 3 [58880/60000 (98%)]	Loss: 0.544352

Test set: Average loss: 0.2539, Accuracy: 9336/10000 (93%)

Train Epoch: 4 [0/60000 (0%)]	Loss: 0.620331
Train Epoch: 4 [2560/60000 (4%)]	Loss: 0.567480
Train Epoch: 4 [5120/60000 (9%)]	Loss: 0.508343
Train Epoch: 4 [7680/60000 (13%)]	Loss: 0.494243
Train Epoch: 4 [10240/60000 (17%)]	Loss: 0.520037
Train Epoch: 4 [12800/60000 (21%)]	Loss: 0.495483
Train Epoch: 4 [15360/60000 (26%)]	Loss: 0.469755
Train Epoch: 4 [17920/60000 (30%)]	Loss: 0.593844
Train Epoch: 4 [20480/60000 (34%)]	Loss: 0.403337
Train Epoch: 4 [23040/60000 (38%)]	Loss: 0.546628
Train Epoch: 4 [25600/60000 (43%)]	Loss: 0.521818
Train Epoch: 4 [28160/60000 (47%)]	Loss: 0.461603
Train Epoch: 4 [30720/60000 (51%)]	Loss: 0.451833
Train Epoch: 4 [33280/60000 (55%)]	Loss: 0.439977
Train Epoch: 4 [35840/60000 (60%)]	Loss: 0.529679
Train Epoch: 4 [38400/60000 (64%)]	Loss: 0.489150
Train Epoch: 4 [40960/60000 (68%)]	Loss: 0.542619
Train Epoch: 4 [43520/60000 (72%)]	Loss: 0.471994
Train Epoch: 4 [46080/60000 (77%)]	Loss: 0.438699
Train Epoch: 4 [48640/60000 (81%)]	Loss: 0.415263
Train Epoch: 4 [51200/60000 (85%)]	Loss: 0.391874
Train Epoch: 4 [53760/60000 (89%)]	Loss: 0.521654
Train Epoch: 4 [56320/60000 (94%)]	Loss: 0.433007
Train Epoch: 4 [58880/60000 (98%)]	Loss: 0.388784

Test set: Average loss: 0.2137, Accuracy: 9432/10000 (94%)

Train Epoch: 5 [0/60000 (0%)]	Loss: 0.531463
Train Epoch: 5 [2560/60000 (4%)]	Loss: 0.501781
Train Epoch: 5 [5120/60000 (9%)]	Loss: 0.392028
Train Epoch: 5 [7680/60000 (13%)]	Loss: 0.506029
Train Epoch: 5 [10240/60000 (17%)]	Loss: 0.437021
Train Epoch: 5 [12800/60000 (21%)]	Loss: 0.447683
Train Epoch: 5 [15360/60000 (26%)]	Loss: 0.392386
Train Epoch: 5 [17920/60000 (30%)]	Loss: 0.416837
Train Epoch: 5 [20480/60000 (34%)]	Loss: 0.451183
Train Epoch: 5 [23040/60000 (38%)]	Loss: 0.433644
Train Epoch: 5 [25600/60000 (43%)]	Loss: 0.470782
Train Epoch: 5 [28160/60000 (47%)]	Loss: 0.421898
Train Epoch: 5 [30720/60000 (51%)]	Loss: 0.445305
Train Epoch: 5 [33280/60000 (55%)]	Loss: 0.479172
Train Epoch: 5 [35840/60000 (60%)]	Loss: 0.375300
Train Epoch: 5 [38400/60000 (64%)]	Loss: 0.370561
Train Epoch: 5 [40960/60000 (68%)]	Loss: 0.532634
Train Epoch: 5 [43520/60000 (72%)]	Loss: 0.383319
Train Epoch: 5 [46080/60000 (77%)]	Loss: 0.506851
Train Epoch: 5 [48640/60000 (81%)]	Loss: 0.408551
Train Epoch: 5 [51200/60000 (85%)]	Loss: 0.392309
Train Epoch: 5 [53760/60000 (89%)]	Loss: 0.477996
Train Epoch: 5 [56320/60000 (94%)]	Loss: 0.414231
Train Epoch: 5 [58880/60000 (98%)]	Loss: 0.430027

Test set: Average loss: 0.1803, Accuracy: 9499/10000 (95%)

Train Epoch: 6 [0/60000 (0%)]	Loss: 0.415102
Train Epoch: 6 [2560/60000 (4%)]	Loss: 0.281096
Train Epoch: 6 [5120/60000 (9%)]	Loss: 0.363706
Train Epoch: 6 [7680/60000 (13%)]	Loss: 0.390821
Train Epoch: 6 [10240/60000 (17%)]	Loss: 0.408141
Train Epoch: 6 [12800/60000 (21%)]	Loss: 0.422376
Train Epoch: 6 [15360/60000 (26%)]	Loss: 0.394143
Train Epoch: 6 [17920/60000 (30%)]	Loss: 0.375477
Train Epoch: 6 [20480/60000 (34%)]	Loss: 0.525811
Train Epoch: 6 [23040/60000 (38%)]	Loss: 0.497835
Train Epoch: 6 [25600/60000 (43%)]	Loss: 0.407303
Train Epoch: 6 [28160/60000 (47%)]	Loss: 0.464176
Train Epoch: 6 [30720/60000 (51%)]	Loss: 0.538057
Train Epoch: 6 [33280/60000 (55%)]	Loss: 0.390377
Train Epoch: 6 [35840/60000 (60%)]	Loss: 0.403131
Train Epoch: 6 [38400/60000 (64%)]	Loss: 0.512176
Train Epoch: 6 [40960/60000 (68%)]	Loss: 0.378995
Train Epoch: 6 [43520/60000 (72%)]	Loss: 0.485860
Train Epoch: 6 [46080/60000 (77%)]	Loss: 0.388245
Train Epoch: 6 [48640/60000 (81%)]	Loss: 0.388625
Train Epoch: 6 [51200/60000 (85%)]	Loss: 0.379450
Train Epoch: 6 [53760/60000 (89%)]	Loss: 0.407995
Train Epoch: 6 [56320/60000 (94%)]	Loss: 0.398069
Train Epoch: 6 [58880/60000 (98%)]	Loss: 0.372017

Test set: Average loss: 0.1640, Accuracy: 9545/10000 (95%)

Train Epoch: 7 [0/60000 (0%)]	Loss: 0.357198
Train Epoch: 7 [2560/60000 (4%)]	Loss: 0.393828
Train Epoch: 7 [5120/60000 (9%)]	Loss: 0.400493
Train Epoch: 7 [7680/60000 (13%)]	Loss: 0.352309
Train Epoch: 7 [10240/60000 (17%)]	Loss: 0.357330
Train Epoch: 7 [12800/60000 (21%)]	Loss: 0.320281
Train Epoch: 7 [15360/60000 (26%)]	Loss: 0.440437
Train Epoch: 7 [17920/60000 (30%)]	Loss: 0.388508
Train Epoch: 7 [20480/60000 (34%)]	Loss: 0.403732
Train Epoch: 7 [23040/60000 (38%)]	Loss: 0.322718
Train Epoch: 7 [25600/60000 (43%)]	Loss: 0.372186
Train Epoch: 7 [28160/60000 (47%)]	Loss: 0.383310
Train Epoch: 7 [30720/60000 (51%)]	Loss: 0.390324
Train Epoch: 7 [33280/60000 (55%)]	Loss: 0.397030
Train Epoch: 7 [35840/60000 (60%)]	Loss: 0.285954
Train Epoch: 7 [38400/60000 (64%)]	Loss: 0.353880
Train Epoch: 7 [40960/60000 (68%)]	Loss: 0.240818
Train Epoch: 7 [43520/60000 (72%)]	Loss: 0.425822
Train Epoch: 7 [46080/60000 (77%)]	Loss: 0.457441
Train Epoch: 7 [48640/60000 (81%)]	Loss: 0.375786
Train Epoch: 7 [51200/60000 (85%)]	Loss: 0.365595
Train Epoch: 7 [53760/60000 (89%)]	Loss: 0.364900
Train Epoch: 7 [56320/60000 (94%)]	Loss: 0.266271
Train Epoch: 7 [58880/60000 (98%)]	Loss: 0.442880

Test set: Average loss: 0.1536, Accuracy: 9577/10000 (96%)

Train Epoch: 8 [0/60000 (0%)]	Loss: 0.371176
Train Epoch: 8 [2560/60000 (4%)]	Loss: 0.291452
Train Epoch: 8 [5120/60000 (9%)]	Loss: 0.416313
Train Epoch: 8 [7680/60000 (13%)]	Loss: 0.366298
Train Epoch: 8 [10240/60000 (17%)]	Loss: 0.296640
Train Epoch: 8 [12800/60000 (21%)]	Loss: 0.348865
Train Epoch: 8 [15360/60000 (26%)]	Loss: 0.417321
Train Epoch: 8 [17920/60000 (30%)]	Loss: 0.367058
Train Epoch: 8 [20480/60000 (34%)]	Loss: 0.336347
Train Epoch: 8 [23040/60000 (38%)]	Loss: 0.362581
Train Epoch: 8 [25600/60000 (43%)]	Loss: 0.288644
Train Epoch: 8 [28160/60000 (47%)]	Loss: 0.330283
Train Epoch: 8 [30720/60000 (51%)]	Loss: 0.298797
Train Epoch: 8 [33280/60000 (55%)]	Loss: 0.344115
Train Epoch: 8 [35840/60000 (60%)]	Loss: 0.356521
Train Epoch: 8 [38400/60000 (64%)]	Loss: 0.297551
Train Epoch: 8 [40960/60000 (68%)]	Loss: 0.440755
Train Epoch: 8 [43520/60000 (72%)]	Loss: 0.364442
Train Epoch: 8 [46080/60000 (77%)]	Loss: 0.272985
Train Epoch: 8 [48640/60000 (81%)]	Loss: 0.342815
Train Epoch: 8 [51200/60000 (85%)]	Loss: 0.329443
Train Epoch: 8 [53760/60000 (89%)]	Loss: 0.257021
Train Epoch: 8 [56320/60000 (94%)]	Loss: 0.350975
Train Epoch: 8 [58880/60000 (98%)]	Loss: 0.252117

Test set: Average loss: 0.1425, Accuracy: 9592/10000 (96%)

Train Epoch: 9 [0/60000 (0%)]	Loss: 0.308571
Train Epoch: 9 [2560/60000 (4%)]	Loss: 0.354496
Train Epoch: 9 [5120/60000 (9%)]	Loss: 0.454645
Train Epoch: 9 [7680/60000 (13%)]	Loss: 0.340073
Train Epoch: 9 [10240/60000 (17%)]	Loss: 0.304953
Train Epoch: 9 [12800/60000 (21%)]	Loss: 0.266210
Train Epoch: 9 [15360/60000 (26%)]	Loss: 0.414784
Train Epoch: 9 [17920/60000 (30%)]	Loss: 0.324893
Train Epoch: 9 [20480/60000 (34%)]	Loss: 0.367169
Train Epoch: 9 [23040/60000 (38%)]	Loss: 0.346932
Train Epoch: 9 [25600/60000 (43%)]	Loss: 0.382222
Train Epoch: 9 [28160/60000 (47%)]	Loss: 0.356705
Train Epoch: 9 [30720/60000 (51%)]	Loss: 0.287982
Train Epoch: 9 [33280/60000 (55%)]	Loss: 0.280525
Train Epoch: 9 [35840/60000 (60%)]	Loss: 0.244508
Train Epoch: 9 [38400/60000 (64%)]	Loss: 0.290698
Train Epoch: 9 [40960/60000 (68%)]	Loss: 0.352147
Train Epoch: 9 [43520/60000 (72%)]	Loss: 0.352036
Train Epoch: 9 [46080/60000 (77%)]	Loss: 0.398510
Train Epoch: 9 [48640/60000 (81%)]	Loss: 0.291793
Train Epoch: 9 [51200/60000 (85%)]	Loss: 0.276297
Train Epoch: 9 [53760/60000 (89%)]	Loss: 0.345035
Train Epoch: 9 [56320/60000 (94%)]	Loss: 0.246514
Train Epoch: 9 [58880/60000 (98%)]	Loss: 0.306455

Test set: Average loss: 0.1283, Accuracy: 9633/10000 (96%)

Train Epoch: 10 [0/60000 (0%)]	Loss: 0.217172
Train Epoch: 10 [2560/60000 (4%)]	Loss: 0.325854
Train Epoch: 10 [5120/60000 (9%)]	Loss: 0.344469
Train Epoch: 10 [7680/60000 (13%)]	Loss: 0.270222
Train Epoch: 10 [10240/60000 (17%)]	Loss: 0.369047
Train Epoch: 10 [12800/60000 (21%)]	Loss: 0.422427
Train Epoch: 10 [15360/60000 (26%)]	Loss: 0.279517
Train Epoch: 10 [17920/60000 (30%)]	Loss: 0.290899
Train Epoch: 10 [20480/60000 (34%)]	Loss: 0.312549
Train Epoch: 10 [23040/60000 (38%)]	Loss: 0.253973
Train Epoch: 10 [25600/60000 (43%)]	Loss: 0.304302
Train Epoch: 10 [28160/60000 (47%)]	Loss: 0.287465
Train Epoch: 10 [30720/60000 (51%)]	Loss: 0.238241
Train Epoch: 10 [33280/60000 (55%)]	Loss: 0.431481
Train Epoch: 10 [35840/60000 (60%)]	Loss: 0.208366
Train Epoch: 10 [38400/60000 (64%)]	Loss: 0.290634
Train Epoch: 10 [40960/60000 (68%)]	Loss: 0.279229
Train Epoch: 10 [43520/60000 (72%)]	Loss: 0.297195
Train Epoch: 10 [46080/60000 (77%)]	Loss: 0.251031
Train Epoch: 10 [48640/60000 (81%)]	Loss: 0.311252
Train Epoch: 10 [51200/60000 (85%)]	Loss: 0.391167
Train Epoch: 10 [53760/60000 (89%)]	Loss: 0.389775
Train Epoch: 10 [56320/60000 (94%)]	Loss: 0.315159
Train Epoch: 10 [58880/60000 (98%)]	Loss: 0.249528

Test set: Average loss: 0.1238, Accuracy: 9652/10000 (97%)

Training log with DropBlock2DV2:

Train Epoch: 1 [0/60000 (0%)]	Loss: 2.361322
Train Epoch: 1 [2560/60000 (4%)]	Loss: 2.341034
Train Epoch: 1 [5120/60000 (9%)]	Loss: 2.287267
Train Epoch: 1 [7680/60000 (13%)]	Loss: 2.274684
Train Epoch: 1 [10240/60000 (17%)]	Loss: 2.260440
Train Epoch: 1 [12800/60000 (21%)]	Loss: 2.259492
Train Epoch: 1 [15360/60000 (26%)]	Loss: 2.240603
Train Epoch: 1 [17920/60000 (30%)]	Loss: 2.207781
Train Epoch: 1 [20480/60000 (34%)]	Loss: 2.177025
Train Epoch: 1 [23040/60000 (38%)]	Loss: 2.137965
Train Epoch: 1 [25600/60000 (43%)]	Loss: 2.029636
Train Epoch: 1 [28160/60000 (47%)]	Loss: 1.967242
Train Epoch: 1 [30720/60000 (51%)]	Loss: 1.948036
Train Epoch: 1 [33280/60000 (55%)]	Loss: 1.856993
Train Epoch: 1 [35840/60000 (60%)]	Loss: 1.786044
Train Epoch: 1 [38400/60000 (64%)]	Loss: 1.657677
Train Epoch: 1 [40960/60000 (68%)]	Loss: 1.603945
Train Epoch: 1 [43520/60000 (72%)]	Loss: 1.550625
Train Epoch: 1 [46080/60000 (77%)]	Loss: 1.424808
Train Epoch: 1 [48640/60000 (81%)]	Loss: 1.454958
Train Epoch: 1 [51200/60000 (85%)]	Loss: 1.263500
Train Epoch: 1 [53760/60000 (89%)]	Loss: 1.255482
Train Epoch: 1 [56320/60000 (94%)]	Loss: 1.157445
Train Epoch: 1 [58880/60000 (98%)]	Loss: 1.271838

Test set: Average loss: 0.7383, Accuracy: 8467/10000 (85%)

Train Epoch: 2 [0/60000 (0%)]	Loss: 1.198318
Train Epoch: 2 [2560/60000 (4%)]	Loss: 1.133464
Train Epoch: 2 [5120/60000 (9%)]	Loss: 0.957895
Train Epoch: 2 [7680/60000 (13%)]	Loss: 1.068100
Train Epoch: 2 [10240/60000 (17%)]	Loss: 1.103465
Train Epoch: 2 [12800/60000 (21%)]	Loss: 0.948980
Train Epoch: 2 [15360/60000 (26%)]	Loss: 0.931363
Train Epoch: 2 [17920/60000 (30%)]	Loss: 0.922501
Train Epoch: 2 [20480/60000 (34%)]	Loss: 0.954827
Train Epoch: 2 [23040/60000 (38%)]	Loss: 0.876058
Train Epoch: 2 [25600/60000 (43%)]	Loss: 0.891441
Train Epoch: 2 [28160/60000 (47%)]	Loss: 0.835737
Train Epoch: 2 [30720/60000 (51%)]	Loss: 0.847362
Train Epoch: 2 [33280/60000 (55%)]	Loss: 0.790432
Train Epoch: 2 [35840/60000 (60%)]	Loss: 0.857441
Train Epoch: 2 [38400/60000 (64%)]	Loss: 0.718644
Train Epoch: 2 [40960/60000 (68%)]	Loss: 0.751785
Train Epoch: 2 [43520/60000 (72%)]	Loss: 0.803771
Train Epoch: 2 [46080/60000 (77%)]	Loss: 0.754844
Train Epoch: 2 [48640/60000 (81%)]	Loss: 0.751976
Train Epoch: 2 [51200/60000 (85%)]	Loss: 0.717965
Train Epoch: 2 [53760/60000 (89%)]	Loss: 0.781195
Train Epoch: 2 [56320/60000 (94%)]	Loss: 0.730536
Train Epoch: 2 [58880/60000 (98%)]	Loss: 0.689717

Test set: Average loss: 0.3512, Accuracy: 9149/10000 (91%)

Train Epoch: 3 [0/60000 (0%)]	Loss: 0.637135
Train Epoch: 3 [2560/60000 (4%)]	Loss: 0.669247
Train Epoch: 3 [5120/60000 (9%)]	Loss: 0.628941
Train Epoch: 3 [7680/60000 (13%)]	Loss: 0.577849
Train Epoch: 3 [10240/60000 (17%)]	Loss: 0.592452
Train Epoch: 3 [12800/60000 (21%)]	Loss: 0.653468
Train Epoch: 3 [15360/60000 (26%)]	Loss: 0.664662
Train Epoch: 3 [17920/60000 (30%)]	Loss: 0.587736
Train Epoch: 3 [20480/60000 (34%)]	Loss: 0.608644
Train Epoch: 3 [23040/60000 (38%)]	Loss: 0.687277
Train Epoch: 3 [25600/60000 (43%)]	Loss: 0.544342
Train Epoch: 3 [28160/60000 (47%)]	Loss: 0.708621
Train Epoch: 3 [30720/60000 (51%)]	Loss: 0.593251
Train Epoch: 3 [33280/60000 (55%)]	Loss: 0.579146
Train Epoch: 3 [35840/60000 (60%)]	Loss: 0.549925
Train Epoch: 3 [38400/60000 (64%)]	Loss: 0.525394
Train Epoch: 3 [40960/60000 (68%)]	Loss: 0.661625
Train Epoch: 3 [43520/60000 (72%)]	Loss: 0.455882
Train Epoch: 3 [46080/60000 (77%)]	Loss: 0.563778
Train Epoch: 3 [48640/60000 (81%)]	Loss: 0.553182
Train Epoch: 3 [51200/60000 (85%)]	Loss: 0.553937
Train Epoch: 3 [53760/60000 (89%)]	Loss: 0.606809
Train Epoch: 3 [56320/60000 (94%)]	Loss: 0.594416
Train Epoch: 3 [58880/60000 (98%)]	Loss: 0.544529

Test set: Average loss: 0.2543, Accuracy: 9338/10000 (93%)

Train Epoch: 4 [0/60000 (0%)]	Loss: 0.619532
Train Epoch: 4 [2560/60000 (4%)]	Loss: 0.567222
Train Epoch: 4 [5120/60000 (9%)]	Loss: 0.508649
Train Epoch: 4 [7680/60000 (13%)]	Loss: 0.494902
Train Epoch: 4 [10240/60000 (17%)]	Loss: 0.521278
Train Epoch: 4 [12800/60000 (21%)]	Loss: 0.495832
Train Epoch: 4 [15360/60000 (26%)]	Loss: 0.468417
Train Epoch: 4 [17920/60000 (30%)]	Loss: 0.595662
Train Epoch: 4 [20480/60000 (34%)]	Loss: 0.403730
Train Epoch: 4 [23040/60000 (38%)]	Loss: 0.547263
Train Epoch: 4 [25600/60000 (43%)]	Loss: 0.523064
Train Epoch: 4 [28160/60000 (47%)]	Loss: 0.460831
Train Epoch: 4 [30720/60000 (51%)]	Loss: 0.452652
Train Epoch: 4 [33280/60000 (55%)]	Loss: 0.439493
Train Epoch: 4 [35840/60000 (60%)]	Loss: 0.528650
Train Epoch: 4 [38400/60000 (64%)]	Loss: 0.487770
Train Epoch: 4 [40960/60000 (68%)]	Loss: 0.540879
Train Epoch: 4 [43520/60000 (72%)]	Loss: 0.470456
Train Epoch: 4 [46080/60000 (77%)]	Loss: 0.437475
Train Epoch: 4 [48640/60000 (81%)]	Loss: 0.415573
Train Epoch: 4 [51200/60000 (85%)]	Loss: 0.392564
Train Epoch: 4 [53760/60000 (89%)]	Loss: 0.521458
Train Epoch: 4 [56320/60000 (94%)]	Loss: 0.433528
Train Epoch: 4 [58880/60000 (98%)]	Loss: 0.389609

Test set: Average loss: 0.2143, Accuracy: 9432/10000 (94%)

Train Epoch: 5 [0/60000 (0%)]	Loss: 0.531064
Train Epoch: 5 [2560/60000 (4%)]	Loss: 0.501403
Train Epoch: 5 [5120/60000 (9%)]	Loss: 0.393153
Train Epoch: 5 [7680/60000 (13%)]	Loss: 0.507973
Train Epoch: 5 [10240/60000 (17%)]	Loss: 0.437351
Train Epoch: 5 [12800/60000 (21%)]	Loss: 0.449640
Train Epoch: 5 [15360/60000 (26%)]	Loss: 0.393346
Train Epoch: 5 [17920/60000 (30%)]	Loss: 0.416081
Train Epoch: 5 [20480/60000 (34%)]	Loss: 0.449489
Train Epoch: 5 [23040/60000 (38%)]	Loss: 0.432535
Train Epoch: 5 [25600/60000 (43%)]	Loss: 0.470373
Train Epoch: 5 [28160/60000 (47%)]	Loss: 0.421085
Train Epoch: 5 [30720/60000 (51%)]	Loss: 0.445716
Train Epoch: 5 [33280/60000 (55%)]	Loss: 0.478192
Train Epoch: 5 [35840/60000 (60%)]	Loss: 0.374121
Train Epoch: 5 [38400/60000 (64%)]	Loss: 0.369466
Train Epoch: 5 [40960/60000 (68%)]	Loss: 0.533549
Train Epoch: 5 [43520/60000 (72%)]	Loss: 0.383733
Train Epoch: 5 [46080/60000 (77%)]	Loss: 0.507195
Train Epoch: 5 [48640/60000 (81%)]	Loss: 0.407680
Train Epoch: 5 [51200/60000 (85%)]	Loss: 0.392881
Train Epoch: 5 [53760/60000 (89%)]	Loss: 0.479417
Train Epoch: 5 [56320/60000 (94%)]	Loss: 0.414116
Train Epoch: 5 [58880/60000 (98%)]	Loss: 0.432079

Test set: Average loss: 0.1801, Accuracy: 9501/10000 (95%)

Train Epoch: 6 [0/60000 (0%)]	Loss: 0.415744
Train Epoch: 6 [2560/60000 (4%)]	Loss: 0.280737
Train Epoch: 6 [5120/60000 (9%)]	Loss: 0.364816
Train Epoch: 6 [7680/60000 (13%)]	Loss: 0.390640
Train Epoch: 6 [10240/60000 (17%)]	Loss: 0.410318
Train Epoch: 6 [12800/60000 (21%)]	Loss: 0.423457
Train Epoch: 6 [15360/60000 (26%)]	Loss: 0.392294
Train Epoch: 6 [17920/60000 (30%)]	Loss: 0.373533
Train Epoch: 6 [20480/60000 (34%)]	Loss: 0.528408
Train Epoch: 6 [23040/60000 (38%)]	Loss: 0.498351
Train Epoch: 6 [25600/60000 (43%)]	Loss: 0.406549
Train Epoch: 6 [28160/60000 (47%)]	Loss: 0.462406
Train Epoch: 6 [30720/60000 (51%)]	Loss: 0.534846
Train Epoch: 6 [33280/60000 (55%)]	Loss: 0.390974
Train Epoch: 6 [35840/60000 (60%)]	Loss: 0.403040
Train Epoch: 6 [38400/60000 (64%)]	Loss: 0.513974
Train Epoch: 6 [40960/60000 (68%)]	Loss: 0.380744
Train Epoch: 6 [43520/60000 (72%)]	Loss: 0.485197
Train Epoch: 6 [46080/60000 (77%)]	Loss: 0.387865
Train Epoch: 6 [48640/60000 (81%)]	Loss: 0.387096
Train Epoch: 6 [51200/60000 (85%)]	Loss: 0.380673
Train Epoch: 6 [53760/60000 (89%)]	Loss: 0.407134
Train Epoch: 6 [56320/60000 (94%)]	Loss: 0.398302
Train Epoch: 6 [58880/60000 (98%)]	Loss: 0.372190

Test set: Average loss: 0.1641, Accuracy: 9549/10000 (95%)

Train Epoch: 7 [0/60000 (0%)]	Loss: 0.356512
Train Epoch: 7 [2560/60000 (4%)]	Loss: 0.393870
Train Epoch: 7 [5120/60000 (9%)]	Loss: 0.402270
Train Epoch: 7 [7680/60000 (13%)]	Loss: 0.353308
Train Epoch: 7 [10240/60000 (17%)]	Loss: 0.358547
Train Epoch: 7 [12800/60000 (21%)]	Loss: 0.319642
Train Epoch: 7 [15360/60000 (26%)]	Loss: 0.438179
Train Epoch: 7 [17920/60000 (30%)]	Loss: 0.386755
Train Epoch: 7 [20480/60000 (34%)]	Loss: 0.404731
Train Epoch: 7 [23040/60000 (38%)]	Loss: 0.322265
Train Epoch: 7 [25600/60000 (43%)]	Loss: 0.372252
Train Epoch: 7 [28160/60000 (47%)]	Loss: 0.381766
Train Epoch: 7 [30720/60000 (51%)]	Loss: 0.390532
Train Epoch: 7 [33280/60000 (55%)]	Loss: 0.396237
Train Epoch: 7 [35840/60000 (60%)]	Loss: 0.285679
Train Epoch: 7 [38400/60000 (64%)]	Loss: 0.355077
Train Epoch: 7 [40960/60000 (68%)]	Loss: 0.241128
Train Epoch: 7 [43520/60000 (72%)]	Loss: 0.426708
Train Epoch: 7 [46080/60000 (77%)]	Loss: 0.456212
Train Epoch: 7 [48640/60000 (81%)]	Loss: 0.376701
Train Epoch: 7 [51200/60000 (85%)]	Loss: 0.365228
Train Epoch: 7 [53760/60000 (89%)]	Loss: 0.365721
Train Epoch: 7 [56320/60000 (94%)]	Loss: 0.266916
Train Epoch: 7 [58880/60000 (98%)]	Loss: 0.443687

Test set: Average loss: 0.1535, Accuracy: 9579/10000 (96%)

Train Epoch: 8 [0/60000 (0%)]	Loss: 0.370897
Train Epoch: 8 [2560/60000 (4%)]	Loss: 0.289111
Train Epoch: 8 [5120/60000 (9%)]	Loss: 0.414931
Train Epoch: 8 [7680/60000 (13%)]	Loss: 0.367614
Train Epoch: 8 [10240/60000 (17%)]	Loss: 0.296232
Train Epoch: 8 [12800/60000 (21%)]	Loss: 0.348647
Train Epoch: 8 [15360/60000 (26%)]	Loss: 0.418467
Train Epoch: 8 [17920/60000 (30%)]	Loss: 0.364306
Train Epoch: 8 [20480/60000 (34%)]	Loss: 0.336696
Train Epoch: 8 [23040/60000 (38%)]	Loss: 0.364676
Train Epoch: 8 [25600/60000 (43%)]	Loss: 0.286585
Train Epoch: 8 [28160/60000 (47%)]	Loss: 0.332353
Train Epoch: 8 [30720/60000 (51%)]	Loss: 0.295880
Train Epoch: 8 [33280/60000 (55%)]	Loss: 0.344958
Train Epoch: 8 [35840/60000 (60%)]	Loss: 0.355939
Train Epoch: 8 [38400/60000 (64%)]	Loss: 0.297799
Train Epoch: 8 [40960/60000 (68%)]	Loss: 0.443442
Train Epoch: 8 [43520/60000 (72%)]	Loss: 0.366912
Train Epoch: 8 [46080/60000 (77%)]	Loss: 0.272624
Train Epoch: 8 [48640/60000 (81%)]	Loss: 0.340495
Train Epoch: 8 [51200/60000 (85%)]	Loss: 0.332022
Train Epoch: 8 [53760/60000 (89%)]	Loss: 0.256738
Train Epoch: 8 [56320/60000 (94%)]	Loss: 0.351117
Train Epoch: 8 [58880/60000 (98%)]	Loss: 0.253743

Test set: Average loss: 0.1426, Accuracy: 9593/10000 (96%)

Train Epoch: 9 [0/60000 (0%)]	Loss: 0.307950
Train Epoch: 9 [2560/60000 (4%)]	Loss: 0.354884
Train Epoch: 9 [5120/60000 (9%)]	Loss: 0.455334
Train Epoch: 9 [7680/60000 (13%)]	Loss: 0.339795
Train Epoch: 9 [10240/60000 (17%)]	Loss: 0.302723
Train Epoch: 9 [12800/60000 (21%)]	Loss: 0.262783
Train Epoch: 9 [15360/60000 (26%)]	Loss: 0.413777
Train Epoch: 9 [17920/60000 (30%)]	Loss: 0.325851
Train Epoch: 9 [20480/60000 (34%)]	Loss: 0.367753
Train Epoch: 9 [23040/60000 (38%)]	Loss: 0.348576
Train Epoch: 9 [25600/60000 (43%)]	Loss: 0.379523
Train Epoch: 9 [28160/60000 (47%)]	Loss: 0.357496
Train Epoch: 9 [30720/60000 (51%)]	Loss: 0.287231
Train Epoch: 9 [33280/60000 (55%)]	Loss: 0.282984
Train Epoch: 9 [35840/60000 (60%)]	Loss: 0.244869
Train Epoch: 9 [38400/60000 (64%)]	Loss: 0.289696
Train Epoch: 9 [40960/60000 (68%)]	Loss: 0.353052
Train Epoch: 9 [43520/60000 (72%)]	Loss: 0.352727
Train Epoch: 9 [46080/60000 (77%)]	Loss: 0.398184
Train Epoch: 9 [48640/60000 (81%)]	Loss: 0.291148
Train Epoch: 9 [51200/60000 (85%)]	Loss: 0.276232
Train Epoch: 9 [53760/60000 (89%)]	Loss: 0.342424
Train Epoch: 9 [56320/60000 (94%)]	Loss: 0.245837
Train Epoch: 9 [58880/60000 (98%)]	Loss: 0.305476

Test set: Average loss: 0.1284, Accuracy: 9632/10000 (96%)

Train Epoch: 10 [0/60000 (0%)]	Loss: 0.216860
Train Epoch: 10 [2560/60000 (4%)]	Loss: 0.325897
Train Epoch: 10 [5120/60000 (9%)]	Loss: 0.341949
Train Epoch: 10 [7680/60000 (13%)]	Loss: 0.270563
Train Epoch: 10 [10240/60000 (17%)]	Loss: 0.369292
Train Epoch: 10 [12800/60000 (21%)]	Loss: 0.424985
Train Epoch: 10 [15360/60000 (26%)]	Loss: 0.280446
Train Epoch: 10 [17920/60000 (30%)]	Loss: 0.291100
Train Epoch: 10 [20480/60000 (34%)]	Loss: 0.314208
Train Epoch: 10 [23040/60000 (38%)]	Loss: 0.253791
Train Epoch: 10 [25600/60000 (43%)]	Loss: 0.304992
Train Epoch: 10 [28160/60000 (47%)]	Loss: 0.288408
Train Epoch: 10 [30720/60000 (51%)]	Loss: 0.239414
Train Epoch: 10 [33280/60000 (55%)]	Loss: 0.431122
Train Epoch: 10 [35840/60000 (60%)]	Loss: 0.208240
Train Epoch: 10 [38400/60000 (64%)]	Loss: 0.291883
Train Epoch: 10 [40960/60000 (68%)]	Loss: 0.281189
Train Epoch: 10 [43520/60000 (72%)]	Loss: 0.297465
Train Epoch: 10 [46080/60000 (77%)]	Loss: 0.253582
Train Epoch: 10 [48640/60000 (81%)]	Loss: 0.311440
Train Epoch: 10 [51200/60000 (85%)]	Loss: 0.389858
Train Epoch: 10 [53760/60000 (89%)]	Loss: 0.389085
Train Epoch: 10 [56320/60000 (94%)]	Loss: 0.314335
Train Epoch: 10 [58880/60000 (98%)]	Loss: 0.248064

Test set: Average loss: 0.1241, Accuracy: 9655/10000 (97%)

Overall, the two runs are essentially the same.
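To compare the two logs without reading them line by line, the "Test set" lines can be pulled out with a regular expression. The helper below is a hypothetical sketch (parse_test_lines is not part of either repository); the two sample lines are the final test lines of the logs above.

```python
import re

# Matches lines such as:
# "Test set: Average loss: 0.1238, Accuracy: 9652/10000 (97%)"
TEST_LINE = re.compile(r"Test set: Average loss: ([\d.]+), Accuracy: (\d+)/(\d+)")


def parse_test_lines(log_text):
    """Return a list of (average_loss, correct, total) for every test line."""
    return [(float(loss), int(correct), int(total))
            for loss, correct, total in TEST_LINE.findall(log_text)]


log_v1 = "Test set: Average loss: 0.1238, Accuracy: 9652/10000 (97%)"
log_v2 = "Test set: Average loss: 0.1241, Accuracy: 9655/10000 (97%)"
final_v1 = parse_test_lines(log_v1)[-1]
final_v2 = parse_test_lines(log_v2)[-1]
```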
