What does the data processed by conv2d look like?

Date: 2021-08-14


Convolution on single-channel data

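If our training data consists of single-channel photos, each sample is a two-dimensional matrix.

Models are usually trained on small batches (n samples) over many iterations, so one batch of data is three-dimensional: several 2D arrays stacked into a 3D array.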

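Since the whole dataset is split into many such batches, it should in the end be organised as a four-dimensional structure (several 3D arrays forming a 4D array). Note that nn.Conv2d itself reads a 4D input as (N, C, H, W), i.e. batch size, channels, height, width, so for a single-channel image the second axis is a channel axis of size 1.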

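Now suppose we have a photo dataset that contains just one photo.

Assuming the photo is single-channel, the reasoning above says we should organise this one photo as four-dimensional data.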

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'  # show the result of every expression in a cell
import numpy as np
import torch.nn as nn
import torch as t

# a single-channel image is 2D data
img1 = np.array([[3,3,2,1,0],
                [0,0,1,3,1],
                [3,1,2,2,3],
                [2,0,0,2,2],
                [2,0,0,0,1]])

# shape is (5, 5), so the array is two-dimensional
img1.shape
img1_tensor = t.Tensor(img1)
img1_tensor

Output

(5, 5)

tensor([[3., 3., 2., 1., 0.],
        [0., 0., 1., 3., 1.],
        [3., 1., 2., 2., 3.],
        [2., 0., 0., 2., 2.],
        [2., 0., 0., 0., 1.]])

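The photo img1 is single-channel, i.e. 2D data. The shape (5, 5) means the image is 5 pixels high and 5 pixels wide; a shape tuple with two entries means two dimensions.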
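The printed tensor shows values such as `3.` because the legacy `torch.Tensor` constructor converts the integer array to floating point, which is what Conv2d generally expects. The sketch below compares that constructor with `torch.tensor`; the shortened two-row array is only a placeholder, not the post's full image.

```python
import numpy as np
import torch

# Placeholder integer array (shortened; not the full 5x5 image).
img = np.array([[3, 3, 2, 1, 0],
                [0, 0, 1, 3, 1]])

as_int = torch.tensor(img)                         # keeps the integer dtype
as_float = torch.tensor(img, dtype=torch.float32)  # explicit float32
legacy = torch.Tensor(img)                         # legacy constructor, default float dtype

print(as_int.dtype, as_float.dtype, legacy.dtype)
```

Stating the dtype explicitly is arguably the clearer choice, since the intent is visible at the call site.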

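As discussed above, several 2D arrays make up a 3D array. How do we turn 2D data into 3D data? There are two ways:

Put the 2D data in a list, then convert the list to a torch.Tensor
Use torch's unsqueeze(0) method to add one dimension

List method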

img2 = np.array([[3,3,2,1,0],
                [0,0,1,3,1],
                [3,1,2,2,3],
                [2,0,0,2,2],
                [2,0,0,0,1]])
three_dims = [img2]
three_dims1 = t.Tensor(three_dims)
three_dims1.shape
three_dims1

Output

torch.Size([1, 5, 5])

tensor([[[3., 3., 2., 1., 0.],
         [0., 0., 1., 3., 1.],
         [3., 1., 2., 2., 3.],
         [2., 0., 0., 2., 2.],
         [2., 0., 0., 0., 1.]]])

The tuple (1, 5, 5) has three entries, so we have turned the 2D data into 3D data.
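The list method extends naturally to a batch of several images. The sketch below assumes three made-up 5x5 images (constant-valued placeholders, not data from this post) and uses `torch.stack` to join them along a new leading axis.

```python
import numpy as np
import torch

# Three hypothetical 5x5 single-channel images; the values are placeholders.
imgs = [np.full((5, 5), fill_value=i, dtype=np.float32) for i in range(3)]

# Stack along a new leading axis: three (5, 5) arrays become one (3, 5, 5) tensor.
batch = torch.stack([torch.from_numpy(a) for a in imgs], dim=0)
print(batch.shape)  # torch.Size([3, 5, 5])
```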

unsqueeze(0) method

img3 = np.array([[3,3,2,1,0],
                 [0,0,1,3,1],
                 [3,1,2,2,3],
                 [2,0,0,2,2],
                 [2,0,0,0,1]])

img3_tensor = t.Tensor(img3)
three_dims2 = img3_tensor.unsqueeze(0)
three_dims2.shape
three_dims2

Output

torch.Size([1, 5, 5])

tensor([[[3., 3., 2., 1., 0.],
         [0., 0., 1., 3., 1.],
         [3., 1., 2., 2., 3.],
         [2., 0., 0., 2., 2.],
         [2., 0., 0., 0., 1.]]])

unsqueeze(0) also turns the 2D data into 3D data. By the same logic, several 3D tensors make up a 4D tensor, so we can apply unsqueeze(0) once more to go from 3D to 4D.

img3 = np.array([[3,3,2,1,0],
                 [0,0,1,3,1],
                 [3,1,2,2,3],
                 [2,0,0,2,2],
                 [2,0,0,0,1]])

img3_tensor = t.Tensor(img3)
four_dims = img3_tensor.unsqueeze(0).unsqueeze(0)
four_dims.shape
four_dims

Output

torch.Size([1, 1, 5, 5])

tensor([[[[3., 3., 2., 1., 0.],
          [0., 0., 1., 3., 1.],
          [3., 1., 2., 2., 3.],
          [2., 0., 0., 2., 2.],
          [2., 0., 0., 0., 1.]]]])

The tuple (1, 1, 5, 5) has four entries, so we have turned the 2D data into 4D data.
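Chaining unsqueeze(0) is one of several ways to reach the (1, 1, 5, 5) shape. As a small sketch, using a placeholder 5x5 tensor rather than the photo above, the following shows two alternatives and how squeeze(0) undoes the operation.

```python
import torch

x = torch.arange(25, dtype=torch.float32).reshape(5, 5)  # placeholder 5x5 image

a = x.unsqueeze(0).unsqueeze(0)  # add two leading axes, one at a time
b = x[None, None]                # indexing with None also inserts axes
c = x.reshape(1, 1, 5, 5)        # state the target shape explicitly

print(a.shape, b.shape, c.shape)  # each is torch.Size([1, 1, 5, 5])

# squeeze(0) is the inverse of unsqueeze(0): it removes a leading axis of size 1.
back = a.squeeze(0).squeeze(0)
print(back.shape)                 # torch.Size([5, 5])
```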

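Next, we apply a convolution to this 4D data, that is, to the dataset that contains a single photo.

As the figure shows, the kernel size is kernel_size = 3, and the data is single-channel both before and after the convolution,

so in_channels=1 and out_channels=1. The convolution from the animation can be implemented in PyTorch as follows: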

one_img = np.array([[3,3,2,1,0],
                    [0,0,1,3,1],
                    [3,1,2,2,3],
                    [2,0,0,2,2],
                    [2,0,0,0,1]])

# turn the 2D data into 4D data
one_data = t.Tensor(one_img).unsqueeze(0).unsqueeze(0)

# (1, 1, 5, 5): now four-dimensional
one_data.shape

# Conv2d needs 4D input; input and output are both single-channel and the kernel size is 3
conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)

conv(one_data)

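The Conv2d call succeeds, which shows that the data is organised correctly. The output is as follows: the first line is the shape of the input and the tensor is the 3x3 convolution result. The values depend on the layer's randomly initialised weights, so they differ from run to run.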

torch.Size([1, 1, 5, 5])

tensor([[[[-0.4410, -0.7628, -0.6724],
          [-0.1663,  0.0914, -0.7396],
          [ 0.5701, -0.1886, -0.2795]]]], grad_fn=<ThnnConv2DBackward>)
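Because the weights above are random, the result cannot be checked by hand. The sketch below repeats the same convolution with an all-ones kernel and zero bias (an assumption made purely so the arithmetic is easy to verify; it is not the kernel from the animation). Each output value is then the sum of one 3x3 window, and with the default stride 1 and no padding the output side length is (5 - 3) / 1 + 1 = 3.

```python
import torch
import torch.nn as nn

img = torch.tensor([[3, 3, 2, 1, 0],
                    [0, 0, 1, 3, 1],
                    [3, 1, 2, 2, 3],
                    [2, 0, 0, 2, 2],
                    [2, 0, 0, 0, 1]], dtype=torch.float32)
x = img.unsqueeze(0).unsqueeze(0)  # (1, 1, 5, 5)

conv = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3)
with torch.no_grad():
    conv.weight.fill_(1.0)  # assumption: all-ones kernel, chosen for easy checking
    conv.bias.zero_()       # assumption: zero bias
    out = conv(x)

print(out.shape)  # torch.Size([1, 1, 3, 3])
print(out)        # window sums: [[15, 15, 15], [9, 11, 16], [10, 7, 12]]
```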

Convolution on multi-channel data

Here we assume the photo has three channels.

from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = 'all'  # show the result of every expression in a cell
import numpy as np
import torch.nn as nn
import torch as t

chanel_1 = [[3,3,2,1,0],
            [0,0,1,3,1],
            [3,1,2,2,3],
            [2,0,0,2,2],
            [2,0,0,0,1]]

chanel_2 = [[3,3,2,1,0],
            [0,0,3,3,1],
            [0,1,2,2,3],
            [2,2,0,2,2],
            [2,0,1,0,1]]

chanel_3 = [[3,3,2,1,0],
            [0,0,3,3,1],
            [3,3,2,2,3],
            [2,2,0,2,2],
            [2,5,0,0,1]]

# build a fake photo with three channels
img_data2 = t.Tensor([chanel_1, 
                     chanel_2, 
                     chanel_3])

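# a three-channel image must also be passed to conv2d as 4D data
img_data2 = img_data2.unsqueeze(0)
# (1, 3, 5, 5): now four-dimensional
img_data2.shape

# Conv2d needs 4D input; the input has three channels, so in_channels=3, and we want a single output channel, so out_channels=1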
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)

conv(img_data2)

Conv2d again accepts the input, so the data is organised correctly. The output is as follows:

torch.Size([1, 3, 5, 5])

tensor([[[[ 1.0329,  0.9904, -0.1885],
          [ 0.0870,  0.2805,  0.3501],
          [ 0.4509, -0.1977,  0.9282]]]], grad_fn=<ThnnConv2DBackward>)
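To make the shapes in the multi-channel case explicit, the sketch below runs the same layer configuration on a hypothetical batch of four random three-channel images (placeholder data, not the photo above) and prints the weight and output shapes.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only so the random placeholder data is repeatable

batch = torch.rand(4, 3, 5, 5)  # hypothetical batch: 4 images, 3 channels, 5x5 each
conv = nn.Conv2d(in_channels=3, out_channels=1, kernel_size=3)

out = conv(batch)
print(conv.weight.shape)  # torch.Size([1, 3, 3, 3]): (out_channels, in_channels, kH, kW)
print(out.shape)          # torch.Size([4, 1, 3, 3]): (N, out_channels, H_out, W_out)
```

The three input channels are combined into one output channel, which is why the result above has a channel axis of size 1 even though the input had three.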