Pytorch_Part3_Model Modules

Date: 2020-05-09

Tags: pytorch, part3, model modules


VisualPytorch is published under a domain name with two servers, as follows:
http://nag.visualpytorch.top/static/ (corresponds to 114.115.148.27)
http://visualpytorch.top/static/ (corresponds to 39.97.209.22)

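1、Model Creation and nn.Module

1. Steps for building a model

torch.nn
nn.Parameter	Tensor subclass representing a learnable parameter, such as weight or bias
nn.Module	Base class of all network layers; manages the attributes of a network
nn.functional	Concrete function implementations, such as convolution, pooling, and activation functions
nn.init	Parameter initialization methods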

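2. nn.Module

Attributes

parameters: stores and manages nn.Parameter objects
modules: stores and manages nn.Module objects
buffers: stores and manages buffer attributes, such as running_mean in a BN layer
***_hooks: stores and manages hook functions

Call sequence:

Using the debugger's Step Into, start from the creation of the network model (net = LeNet(classes=2)) and step into every function that is called. Observe when the _modules field of net is created and assigned, and record every class and function entered along the way.

net = LeNet(classes=2)

LeNet class: __init__() calls super(LeNet, self).__init__()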

def __init__(self, classes):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16*5*5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)

Module class: __init__() calls self._construct(), which creates 8 ordered dictionaries

def _construct(self):
        """
        Initializes internal Module state, shared by both nn.Module and ScriptModule.
        """
        torch._C._log_api_usage_once("python.nn_module")
        self._backend = thnn_backend
        self._parameters = OrderedDict()
        self._buffers = OrderedDict()
        self._backward_hooks = OrderedDict()
        self._forward_hooks = OrderedDict()
        self._forward_pre_hooks = OrderedDict()
        self._state_dict_hooks = OrderedDict()
        self._load_state_dict_pre_hooks = OrderedDict()
        self._modules = OrderedDict()

LeNet class: constructs the convolution layer nn.Conv2d(3, 6, 5)

Conv2d class: __init__(); Conv2d inherits from _ConvNd and calls the parent constructor

def __init__(self, in_channels, out_channels, kernel_size, stride=1,
                 padding=0, dilation=1, groups=1,
                 bias=True, padding_mode='zeros'):
        kernel_size = _pair(kernel_size)
        stride = _pair(stride)
        padding = _pair(padding)
        dilation = _pair(dilation)
        super(Conv2d, self).__init__(
            in_channels, out_channels, kernel_size, stride, padding, dilation,
            False, _pair(0), groups, bias, padding_mode)

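_ConvNd class: __init__(); _ConvNd inherits from Module and calls the parent constructor (the same as the second and third steps above), then initializes its own variables

LeNet class: control returns to self.conv1 = nn.Conv2d(3, 6, 5), where the assignment is intercepted by the __setattr__() function of the parent class (nn.Module)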

# name = 'conv1'
# value = Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
modules = self.__dict__.get('_modules')
if isinstance(value, Module):
    if modules is None:
        raise AttributeError(
            "cannot assign module before Module.__init__() call")
    remove_from(self.__dict__, self._parameters, self._buffers)
    modules[name] = value

The layer is thus recorded in the _modules of the LeNet instance
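The interception described above can be imitated in a few lines of plain Python. The sketch below is a simplified stand-in, not PyTorch's actual implementation: MiniModule and Net are hypothetical names, and only the _modules dictionary is modelled.

```python
from collections import OrderedDict


class MiniModule:
    """Toy stand-in for nn.Module that only tracks child modules."""

    def __init__(self):
        # Bypass our own __setattr__ while the registry itself is created.
        object.__setattr__(self, "_modules", OrderedDict())

    def __setattr__(self, name, value):
        if isinstance(value, MiniModule):
            modules = self.__dict__.get("_modules")
            if modules is None:
                raise AttributeError(
                    "cannot assign module before MiniModule.__init__() call")
            modules[name] = value  # register the child in _modules
        else:
            object.__setattr__(self, name, value)

    def __getattr__(self, name):
        # Reached only when normal lookup fails, e.g. for registered children.
        modules = self.__dict__.get("_modules", {})
        if name in modules:
            return modules[name]
        raise AttributeError(name)


class Net(MiniModule):
    def __init__(self):
        super().__init__()
        self.conv1 = MiniModule()  # intercepted and stored in _modules
        self.classes = 2           # ordinary attribute, stored in __dict__


net = Net()
print(list(net._modules))        # ['conv1']
print("conv1" in net.__dict__)   # False
```

The child never lands in the instance's __dict__, which is why forgetting to call the parent constructor first produces the "cannot assign module" error.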

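The remaining network layers are built in the same way, which yields the final net.

Summary

A module can contain multiple submodules
A module corresponds to an operation and must implement the forward() function
Each module has 8 dictionaries that manage its attributes (in the PyTorch version traced here)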

def forward(self, x):
    out = F.relu(self.conv1(x))
    out = F.max_pool2d(out, 2)
    out = F.relu(self.conv2(out))
    out = F.max_pool2d(out, 2)
    out = out.view(out.size(0), -1)
    out = F.relu(self.fc1(out))
    out = F.relu(self.fc2(out))
    out = self.fc3(out)
    return out
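To see the summary in action, the snippet below assembles the LeNet fragments shown earlier into one class and inspects what got registered. It is a minimal sketch that assumes torch is installed and that the input is a batch of 3x32x32 images, the size implied by the 16*5*5 input of fc1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LeNet(nn.Module):
    def __init__(self, classes):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, classes)

    def forward(self, x):
        out = F.max_pool2d(F.relu(self.conv1(x)), 2)
        out = F.max_pool2d(F.relu(self.conv2(out)), 2)
        out = out.view(out.size(0), -1)
        out = F.relu(self.fc1(out))
        out = F.relu(self.fc2(out))
        return self.fc3(out)


net = LeNet(classes=2)

# Submodules are registered in the order they were assigned in __init__.
child_names = [name for name, _ in net.named_children()]
print(child_names)

# Each layer contributes its own weight and bias to the parameter list.
param_names = [name for name, _ in net.named_parameters()]
print(param_names)

# forward() is invoked through net(...), not called directly.
out = net(torch.randn(4, 3, 32, 32))
print(out.shape)
```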

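2、Model Containers and Building AlexNet

nn.Sequential: sequential; layers execute strictly in order; commonly used to build blocks
nn.ModuleList: iterable; commonly used to build many repeated layers with a for loop
nn.ModuleDict: indexable; commonly used for selectable network layers

1. Model container: Sequential

nn.Sequential is a container of nn.Module objects that wraps a group of network layers in order

Sequential: the layers are constructed strictly in order
Built-in forward(): the built-in forward runs the forward pass through each layer in turn with a for loop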

class LeNetSequential(nn.Module):
    def __init__(self, classes):
        super(LeNetSequential, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),)

        '''
        Alternatively, name each layer as below; by default layers are named by their index
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 6, 5),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=2, stride=2),

            'conv2': nn.Conv2d(6, 16, 5),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=2, stride=2),
        }))
        '''

        self.classifier = nn.Sequential(
            nn.Linear(16*5*5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, classes),)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x

Call sequence

LeNetSequential.__init__()

Sequential.__init__()

def __init__(self, *args):
        super(Sequential, self).__init__()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self.add_module(key, module)
        else:
            for idx, module in enumerate(args):
                self.add_module(str(idx), module)

Module.__init__()
Sequential.add_module(): self._modules[name] = module
In LeNetSequential, assigning each Sequential to an attribute is intercepted by __setattr__(); because a Sequential is itself a Module, it becomes part of _modules
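The naming rule and the built-in forward loop can be sketched without PyTorch. The MiniSequential class below is a hypothetical simplification: plain functions stand in for network layers, and only the behaviour discussed above is modelled.

```python
from collections import OrderedDict


class MiniSequential:
    """Toy container mirroring how nn.Sequential names and chains layers."""

    def __init__(self, *args):
        self._modules = OrderedDict()
        if len(args) == 1 and isinstance(args[0], OrderedDict):
            for key, module in args[0].items():
                self._modules[key] = module        # user-supplied names
        else:
            for idx, module in enumerate(args):
                self._modules[str(idx)] = module   # default names: "0", "1", ...

    def forward(self, x):
        # Feed the output of each layer into the next one, in order.
        for module in self._modules.values():
            x = module(x)
        return x


def double(x):
    return 2 * x


def add_one(x):
    return x + 1


unnamed = MiniSequential(double, add_one)
named = MiniSequential(OrderedDict([("double", double), ("add_one", add_one)]))

print(list(unnamed._modules))  # ['0', '1']
print(list(named._modules))    # ['double', 'add_one']
print(unnamed.forward(3))      # 7
```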

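2. Model container: ModuleList

nn.ModuleList is a container of nn.Module objects that wraps a group of network layers and calls them by iteration
Main methods:

append(): adds a network layer to the end of the ModuleList
extend(): appends the layers of another ModuleList (or any iterable of modules)
insert(): inserts a network layer at a given position in the ModuleList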

class ModuleList(nn.Module):
    def __init__(self):
        super(ModuleList, self).__init__()
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(20)])  # one line builds 20 fully connected layers of 10 units each

    def forward(self, x):
        for i, linear in enumerate(self.linears):
            x = linear(x)
        return x

Here, ModuleList.__init__() is:

def __init__(self, modules=None):
        super(ModuleList, self).__init__()
        if modules is not None:
            self += modules
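The three methods listed earlier can be tried on a small container. This is a minimal sketch assuming torch is installed; the layer sizes are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

layers = nn.ModuleList([nn.Linear(10, 10)])
layers.append(nn.Linear(10, 5))               # add one layer at the end
layers.extend([nn.ReLU(), nn.Linear(5, 2)])   # add several layers at once
layers.insert(1, nn.ReLU())                   # insert a layer at position 1

# Resulting order: Linear(10, 10), ReLU, Linear(10, 5), ReLU, Linear(5, 2)
x = torch.randn(4, 10)
for layer in layers:
    x = layer(x)

print(len(layers))
print(x.shape)
```

Unlike a plain Python list, layers held in an nn.ModuleList are registered in _modules, so their parameters show up in parameters().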

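3. Model container: ModuleDict

nn.ModuleDict is a container of nn.Module objects that wraps a group of network layers and calls them by key
Main methods:

clear(): empties the ModuleDict
items(): returns an iterable of key-value pairs
keys(): returns the keys of the dictionary
values(): returns the values of the dictionary
pop(): removes a key from the dictionary and returns its module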

class ModuleDict(nn.Module):
    def __init__(self):
        super(ModuleDict, self).__init__()
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d(3)
        })

        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    def forward(self, x, choice, act):
        x = self.choices[choice](x)
        x = self.activations[act](x)
        return x

Each ModuleDict works like a multiplexer: the path to take must be specified together with the input:

net = ModuleDict()
fake_img = torch.randn((4, 10, 32, 32))
output = net(fake_img, 'conv', 'relu')
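The dictionary-style methods listed earlier behave much like their dict counterparts. A minimal sketch, assuming torch is installed:

```python
import torch.nn as nn

activations = nn.ModuleDict({
    "relu": nn.ReLU(),
    "prelu": nn.PReLU(),
})

names = sorted(activations.keys())    # sorted, so the result does not depend on storage order
print(names)

removed = activations.pop("prelu")    # removes the entry and returns its module
print(type(removed).__name__)
print("prelu" in activations)

activations.clear()                   # remove everything that is left
print(len(activations))
```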

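4. Building AlexNet

AlexNet: won the 2012 ImageNet classification task with an accuracy more than 10 percentage points above the runner-up, opening a new era for convolutional neural networks
Key features of AlexNet:

Uses ReLU: replaces saturating activation functions and alleviates vanishing gradients
Uses LRN (Local Response Normalization): normalizes the data and alleviates vanishing gradients
Dropout: makes the fully connected layers more robust and improves the network's generalization
Data Augmentation: TenCrop and color modification

Construction: uses Sequential and its built-in forward() method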

class AlexNet(nn.Module):

    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        '''
        Or name the layers like this (requires: from collections import OrderedDict)
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=3, stride=2),
            'conv2': nn.Conv2d(64, 192, kernel_size=5, padding=2),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=3, stride=2),
            'conv3': nn.Conv2d(192, 384, kernel_size=3, padding=1),
            'relu3': nn.ReLU(inplace=True),
            'conv4': nn.Conv2d(384, 256, kernel_size=3, padding=1),
            'relu4': nn.ReLU(inplace=True),
            'conv5': nn.Conv2d(256, 256, kernel_size=3, padding=1),
            'relu5': nn.ReLU(inplace=True),
            'pool5': nn.MaxPool2d(kernel_size=3, stride=2),
        }))
        '''
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x
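The 256 * 6 * 6 input size of the first Linear layer can be checked by tracing the spatial size through self.features. The sketch below is plain Python and assumes a 224x224 input, which the original does not state; out_size is a hypothetical helper implementing the usual size formula with dilation 1.

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1


# (kernel_size, stride, padding) of every Conv2d / MaxPool2d in self.features
feature_layers = [
    (11, 4, 2),  # Conv2d(3, 64)
    (3, 2, 0),   # MaxPool2d
    (5, 1, 2),   # Conv2d(64, 192)
    (3, 2, 0),   # MaxPool2d
    (3, 1, 1),   # Conv2d(192, 384)
    (3, 1, 1),   # Conv2d(384, 256)
    (3, 1, 1),   # Conv2d(256, 256)
    (3, 2, 0),   # MaxPool2d
]

size = 224  # assumed input resolution
for kernel, stride, padding in feature_layers:
    size = out_size(size, kernel, stride, padding)

print(size)               # 6
print(256 * size * size)  # 9216
```

For other input resolutions the feature map would not be 6x6, which is where nn.AdaptiveAvgPool2d((6, 6)) comes in: it brings the map to 6x6 before flattening.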

Likewise, torchvision/models contains implementations of other classic networks such as googlenet and resnet.

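3、Convolution Layers

1. 1d/2d/3d convolution

Convolution operation: the kernel slides over the input signal (image), multiplying and accumulating at each position
Convolution kernel: also called a filter; it can be regarded as a certain pattern or feature.
The convolution process is like sliding a template over the image to find regions that resemble it: the more a region resembles the kernel's pattern, the higher the activation, which is how features (detail patterns such as edges, stripes, and colors) are extracted

Convolution dimension: in general, the number of dimensions along which the kernel slides is the dimension of the convolution

2. nn.Conv2d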

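nn.Conv2d(in_channels,           # number of input channels
          out_channels,          # number of output channels, i.e. the number of kernels
          kernel_size,           # kernel size
          stride=1,              # stride
          padding=0,             # amount of padding
          dilation=1,            # dilation size (dilated convolution)
          groups=1,              # grouped-convolution setting
          bias=True,             # bias
          padding_mode='zeros')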

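Function: applies 2D convolution to multiple 2D signals
Main parameters:

dilation: dilation size, used for dilated convolution

groups: grouped-convolution setting

Size calculation: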

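import os
import torch
import torch.nn as nn
from PIL import Image
from torchvision import transforms

torch.manual_seed(3)  # set the random seed; the original calls a set_seed(3) helper that is not shown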

# =================== load img ============
path_img = os.path.join("lena.png")
img = Image.open(path_img).convert('RGB')  # 0~255

# convert to tensor
img_transform = transforms.Compose([transforms.ToTensor()])
img_tensor = img_transform(img)
img_tensor.unsqueeze_(dim=0)    # C*H*W to B*C*H*W

conv_layer = nn.Conv2d(3, 1, 3)   # input:(i, o, size) weights:(o, i , h, w)
nn.init.xavier_normal_(conv_layer.weight.data)

# calculation
img_conv = conv_layer(img_tensor)

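Different kernels produce different results:

The size also changes during convolution:

Size before convolution: torch.Size([1, 3, 512, 512])
Size after convolution: torch.Size([1, 1, 510, 510])

The Parameter of this Conv2d is a 4D tensor used for 2D convolution. Its size is [1, 3, 3, 3] (1 output channel, i.e. one kernel, 3 input channels, and a 3*3 kernel)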
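The size change reported above follows from the standard output-size formula for Conv2d, written here for the height (the width is analogous). The worked line plugs in the settings of the example: input 512, kernel size 3, and the defaults padding 0, dilation 1, stride 1.

```latex
H_{out} = \left\lfloor \frac{H_{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1}{\text{stride}} \right\rfloor + 1

H_{out} = \left\lfloor \frac{512 + 2 \times 0 - 1 \times (3 - 1) - 1}{1} \right\rfloor + 1 = 509 + 1 = 510
```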

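The convolution proceeds as follows: each of the 3 input channels is convolved with its own 3*3 slice of the kernel, and the three results are summed, together with the bias, into the single output channel.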

3. Transposed convolution

nn.ConvTranspose2d(	in_channels,
                    out_channels,
                    kernel_size,
                    stride=1,
                    padding=0,
                    output_padding=0,
                    groups=1,
                    bias=True,
                    dilation=1,
                    padding_mode='zeros')

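Transposed convolution, also called fractionally-strided convolution, is used to upsample an image (UpSample)

Note: although the matrix corresponding to a transposed-convolution kernel has the transposed shape of the matrix corresponding to a convolution kernel, their values are completely unrelated, so a transposed convolution does not invert the convolution.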

conv_layer = nn.ConvTranspose2d(3, 1, 3, stride=2)   # input:(i, o, size)

# Size before convolution: torch.Size([1, 3, 512, 512])
# Size after convolution: torch.Size([1, 1, 1025, 1025])
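The 1025 above can be reproduced with the output-size formula given in the PyTorch documentation for nn.ConvTranspose2d. The helper name below is hypothetical; it handles one spatial dimension at a time.

```python
def conv_transpose2d_output_size(size, kernel_size, stride=1, padding=0,
                                 output_padding=0, dilation=1):
    """Output height (or width) of nn.ConvTranspose2d for one spatial dimension."""
    return ((size - 1) * stride - 2 * padding
            + dilation * (kernel_size - 1) + output_padding + 1)


# Settings of the example: 512 input, 3x3 kernel, stride 2, everything else default.
print(conv_transpose2d_output_size(512, kernel_size=3, stride=2))  # 1025
```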

The output image is larger and shows a regular grid of gaps, known as the checkerboard effect of transposed convolution (https://www.jianshu.com/p/36ff39344de5).

4、nn Layers: Pooling, Linear, and Activation Functions

1. Pooling layers

Max / average

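nn.MaxPool2d(kernel_size,
             stride=None,
             padding=0,
             dilation=1,              # spacing between elements of the pooling window
             return_indices=False,    # also return the indices of the max values
             ceil_mode=False          # round the output size up instead of down
             )
nn.AvgPool2d(kernel_size,
             stride=None,
             padding=0,
             ceil_mode=False,
             count_include_pad=True,  # include the padded values in the average
             divisor_override=None    # if set, used as the divisor instead of the window size
             )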

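Pooling operation: "collects" the signal (many values become few) and "summarizes" it, much like a pool collecting water, hence the name pooling layer

Unpooling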

nn.MaxUnpool2d(	kernel_size,
                stride=None,
                padding=0
              )
forward(self, input, indices, output_size=None)

Function: upsamples a 2D signal (image) by reversing max pooling
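The three layers can be compared on a small 4x4 input. This is a minimal sketch assuming torch is installed; the input values are made up so that the results are easy to read.

```python
import torch
import torch.nn as nn

x = torch.tensor([[[[1., 2., 3., 4.],
                    [5., 6., 7., 8.],
                    [9., 10., 11., 12.],
                    [13., 14., 15., 16.]]]])   # shape (1, 1, 4, 4)

max_pool = nn.MaxPool2d(kernel_size=2, stride=2, return_indices=True)
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)
unpool = nn.MaxUnpool2d(kernel_size=2, stride=2)

pooled, indices = max_pool(x)        # maxima of each 2x2 window, plus their positions
averaged = avg_pool(x)               # mean of each 2x2 window
restored = unpool(pooled, indices)   # maxima put back in place, zeros elsewhere

print(pooled)
print(averaged)
print(restored)
```

MaxUnpool2d needs the indices returned by the pooling layer, which is why return_indices=True is set on MaxPool2d.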

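2. Linear layers

nn.Linear(in_features,    # number of input nodes
          out_features,   # number of output nodes
          bias=True)

A linear layer, also called a fully connected layer, connects each of its neurons to every neuron in the previous layer, producing a linear combination (linear transformation) of the previous layer's outputs

Input = [1, 2, 3], shape = (1, 3)
\(W_0 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{bmatrix}\), shape = (3, 4)
Hidden = Input * W_0 = [6, 12, 18, 24], shape = (1, 4)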
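The worked example can be checked in plain Python. The sketch assumes W_0 consists of three identical rows [1, 2, 3, 4], which is the (3, 4) matrix that reproduces the stated result.

```python
inputs = [1, 2, 3]        # shape (1, 3), written as a flat row
w_0 = [[1, 2, 3, 4],
       [1, 2, 3, 4],
       [1, 2, 3, 4]]      # shape (3, 4)

# hidden[j] = sum over i of inputs[i] * w_0[i][j]
hidden = [sum(x * row[j] for x, row in zip(inputs, w_0)) for j in range(4)]
print(hidden)  # [6, 12, 18, 24]
```

Note that nn.Linear stores its weight with shape (out_features, in_features) and computes x @ W.T + b, so W_0 above corresponds to the transpose of the layer's weight.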

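3. Activation function layers

Activation functions apply a nonlinear transformation to the features, which is what makes depth meaningful: without them, a stack of linear layers collapses into a single linear map

nn.Sigmoid
Formula: \(y = \frac{1}{1+e^{-x}}\)
Gradient: \(y' = y(1 - y)\)
Properties:

The output lies in (0, 1), so it can be read as a probability
The derivative lies in (0, 0.25], which easily leads to vanishing gradients
The output is not zero-mean, which distorts the data distribution

nn.Tanh
Formula: \(y = \frac{\sinh x}{\cosh x} = \frac{e^x-e^{-x}}{e^x+e^{-x}} = \frac{2}{1+e^{-2x}} - 1\)
Gradient: \(y' = 1 - y^2\)
Properties:

The output lies in (-1, 1), so the data is zero-mean
The derivative lies in (0, 1], which easily leads to vanishing gradients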
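The derivative ranges can be checked numerically. The sketch below uses only the standard library and evaluates both gradients on a grid over [-10, 10]; the grid is an arbitrary choice for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    y = sigmoid(x)
    return y * (1.0 - y)       # y' = y(1 - y)


def tanh_grad(x):
    y = math.tanh(x)
    return 1.0 - y * y         # y' = 1 - y^2


xs = [i / 10.0 for i in range(-100, 101)]     # grid over [-10, 10]
print(max(sigmoid_grad(x) for x in xs))       # 0.25, reached at x = 0
print(max(tanh_grad(x) for x in xs))          # 1.0, reached at x = 0
print(sigmoid_grad(10.0), tanh_grad(10.0))    # both nearly 0 far from the origin
```

Since each sigmoid layer scales the gradient by at most 0.25, the product over many layers shrinks quickly, which is the vanishing-gradient problem mentioned in the properties.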

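nn.ReLU
Formula: \(y = \max(0, x)\)
Gradient: \(y' = \begin{cases} 1, & x > 0 \\ \text{undefined}, & x = 0 \\ 0, & x < 0 \end{cases}\)
Properties:

The output is never negative; the zero gradient on the negative half-axis leads to dead neurons
The derivative is 1, which alleviates vanishing gradients but can easily cause exploding gradients

nn.LeakyReLU

negative_slope: slope of the negative half-axis

nn.PReLU

init: initial value of the learnable slope

nn.RReLU

lower: lower bound of the uniform distribution
upper: upper bound of the uniform distribution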
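The ReLU variants differ only in how they treat negative inputs. The functions below are simplified scalar sketches in plain Python, not the PyTorch implementations; the default values mirror the defaults of the corresponding nn layers, and rrelu_train is a hypothetical name for the training-time behaviour of RReLU.

```python
import random


def relu(x):
    return max(0.0, x)


def leaky_relu(x, negative_slope=0.01):
    """Fixed small slope on the negative half-axis."""
    return x if x > 0 else negative_slope * x


def prelu(x, a=0.25):
    """Same shape as leaky_relu, but `a` is a parameter learned during training."""
    return x if x > 0 else a * x


def rrelu_train(x, lower=1.0 / 8, upper=1.0 / 3, rng=random):
    """Training-time behaviour: the negative slope is drawn from U(lower, upper)."""
    return x if x > 0 else rng.uniform(lower, upper) * x


print(relu(-2.0), leaky_relu(-2.0), prelu(-2.0))
print(rrelu_train(-2.0))   # a random value between -2/3 and -2/8
```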
