VisualPytorch is published under one domain name with two servers:

- http://nag.visualpytorch.top/static/ (maps to 114.115.148.27)
- http://visualpytorch.top/static/ (maps to 39.97.209.22)
| torch.nn | Description |
|---|---|
| nn.Parameter | Tensor subclass representing a learnable parameter, e.g. weight, bias |
| nn.Module | Base class of all network layers; manages the network's attributes |
| nn.functional | Functional implementations of ops such as convolution, pooling, activation functions |
| nn.init | Parameter initialization methods |
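A minimal sketch of how these pieces relate (the `Scale` module below is a made-up example, not from the text):

```python
import torch
import torch.nn as nn

class Scale(nn.Module):                       # nn.Module: base class managing the layer
    def __init__(self):
        super(Scale, self).__init__()
        # nn.Parameter: a tensor subclass automatically registered as a learnable parameter
        self.weight = nn.Parameter(torch.ones(3))

    def forward(self, x):
        return x * self.weight

m = Scale()
print(dict(m.named_parameters()))             # {'weight': Parameter containing: tensor([1., 1., 1.], ...)}
nn.init.normal_(m.weight, mean=0.0, std=0.1)  # nn.init: initialize parameters in place
```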
nn.Module attributes

Call sequence:
Using step-into debugging, start from the construction of the network model (`net = LeNet(classes=2)`), step into every called function, observe when `net`'s `_modules` field is built and assigned, and record all classes and functions entered along the way.
Step 1: `net = LeNet(classes=2)`

Step 2: `LeNet.__init__()` calls `super(LeNet, self).__init__()`:
```python
def __init__(self, classes):
    super(LeNet, self).__init__()
    self.conv1 = nn.Conv2d(3, 6, 5)
    self.conv2 = nn.Conv2d(6, 16, 5)
    self.fc1 = nn.Linear(16*5*5, 120)
    self.fc2 = nn.Linear(120, 84)
    self.fc3 = nn.Linear(84, classes)
```
Step 3: `Module.__init__()` calls `self._construct()`, which builds 8 ordered dictionaries:
```python
def _construct(self):
    """
    Initializes internal Module state, shared by both nn.Module and ScriptModule.
    """
    torch._C._log_api_usage_once("python.nn_module")
    self._backend = thnn_backend
    self._parameters = OrderedDict()
    self._buffers = OrderedDict()
    self._backward_hooks = OrderedDict()
    self._forward_hooks = OrderedDict()
    self._forward_pre_hooks = OrderedDict()
    self._state_dict_hooks = OrderedDict()
    self._load_state_dict_pre_hooks = OrderedDict()
    self._modules = OrderedDict()
```
Step 4: back in `LeNet`, the convolution layer `nn.Conv2d(3, 6, 5)` is constructed.

Step 5: `Conv2d.__init__()`, which inherits from `_ConvNd` and calls the parent constructor:
```python
def __init__(self, in_channels, out_channels, kernel_size, stride=1,
             padding=0, dilation=1, groups=1,
             bias=True, padding_mode='zeros'):
    kernel_size = _pair(kernel_size)
    stride = _pair(stride)
    padding = _pair(padding)
    dilation = _pair(dilation)
    super(Conv2d, self).__init__(
        in_channels, out_channels, kernel_size, stride, padding, dilation,
        False, _pair(0), groups, bias, padding_mode)
```
Step 6: `_ConvNd.__init__()`, which inherits from `Module` and calls the parent constructor (same as Steps 2 and 3), then initializes its own variables.
Step 7: back in `LeNet`, control returns to `self.conv1 = nn.Conv2d(3, 6, 5)`, which is intercepted by the parent class's (`nn.Module`) `__setattr__()` function:
```python
# name = 'conv1'
# value = Conv2d(3, 6, kernel_size=(5, 5), stride=(1, 1))
modules = self.__dict__.get('_modules')
if isinstance(value, Module):
    if modules is None:
        raise AttributeError(
            "cannot assign module before Module.__init__() call")
    remove_from(self.__dict__, self._parameters, self._buffers)
    modules[name] = value
```
The `Conv2d` object is thus recorded in `LeNet`'s `_modules`. The remaining layers are built in the same way; the final `net` is as follows:
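As a quick check (a minimal sketch based on the LeNet definition above), the recorded sub-modules can be listed directly:

```python
net = LeNet(classes=2)
print(net._modules.keys())
# expected: odict_keys(['conv1', 'conv2', 'fc1', 'fc2', 'fc3'])
print(net)  # the printed module hierarchy is driven by this same dict
```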
Summary: a network is built in two steps: the sub-modules are constructed in `__init__()` and assembled in `forward()`, e.g. LeNet's `forward()`:
```python
def forward(self, x):
    out = F.relu(self.conv1(x))
    out = F.max_pool2d(out, 2)
    out = F.relu(self.conv2(out))
    out = F.max_pool2d(out, 2)
    out = out.view(out.size(0), -1)
    out = F.relu(self.fc1(out))
    out = F.relu(self.fc2(out))
    out = self.fc3(out)
    return out
```
- nn.Sequential: sequential; layers are executed strictly in order; commonly used to build blocks
- nn.ModuleList: iterative; commonly used to build many repeated layers, e.g. via a for loop
- nn.ModuleDict: indexed; commonly used for selectable network layers
nn.Sequential is a container for nn.Module that wraps a group of layers to be executed in order.
```python
class LeNetSequential(nn.Module):
    def __init__(self, classes):
        super(LeNetSequential, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 6, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, 5),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),)
        '''
        Alternatively, name each layer explicitly (by default layers are named by index):
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 6, 5),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=2, stride=2),
            'conv2': nn.Conv2d(6, 16, 5),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=2, stride=2),
        }))
        '''
        self.classifier = nn.Sequential(
            nn.Linear(16*5*5, 120),
            nn.ReLU(),
            nn.Linear(120, 84),
            nn.ReLU(),
            nn.Linear(84, classes),)

    def forward(self, x):
        x = self.features(x)
        x = x.view(x.size()[0], -1)
        x = self.classifier(x)
        return x
```
Call sequence:

Step 1: `LeNetSequential.__init__()`

Step 2: `Sequential.__init__()`:
```python
def __init__(self, *args):
    super(Sequential, self).__init__()
    if len(args) == 1 and isinstance(args[0], OrderedDict):
        for key, module in args[0].items():
            self.add_module(key, module)
    else:
        for idx, module in enumerate(args):
            self.add_module(str(idx), module)
```
Step 3: `Module.__init__()`
Step 4: `Sequential.add_module()`: `self._modules[name] = module`
In `LeNetSequential`, the assignment of the `Sequential` object is likewise intercepted by `__setattr__()`; since `Sequential` is itself a `Module`, it is registered as part of `_modules`.
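A small sketch (not from the original text) that makes the naming behavior visible: with positional arguments the sub-modules of `features` are keyed by index, while the OrderedDict variant would use the given names:

```python
net = LeNetSequential(classes=2)
print(net._modules.keys())           # odict_keys(['features', 'classifier'])
print(net.features._modules.keys())  # odict_keys(['0', '1', '2', '3', '4', '5'])
```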
nn.ModuleList is a container for nn.Module that wraps a group of layers and calls them iteratively.

Main methods: `append()` (add a module at the end), `extend()` (concatenate another list of modules), `insert()` (insert a module at a given index); a short sketch of these methods follows the example below.
```python
class ModuleList(nn.Module):
    def __init__(self):
        super(ModuleList, self).__init__()
        # 20 fully-connected layers of 10 units each, in a single line of code
        self.linears = nn.ModuleList([nn.Linear(10, 10) for i in range(20)])

    def forward(self, x):
        for i, linear in enumerate(self.linears):
            x = linear(x)
        return x
```
Here, `ModuleList.__init__()` is:
```python
def __init__(self, modules=None):
    super(ModuleList, self).__init__()
    if modules is not None:
        self += modules
```
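A short sketch (illustrative, not from the original text) of the ModuleList methods mentioned above:

```python
layers = nn.ModuleList([nn.Linear(10, 10) for _ in range(3)])
layers.append(nn.Linear(10, 2))   # append a module at the end
layers.extend([nn.ReLU()])        # concatenate an iterable of modules
layers.insert(0, nn.ReLU())       # insert a module at a given index
print(len(layers))                # 6
```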
nn.ModuleDict is a container for nn.Module that wraps a group of layers and calls them by key.

Main methods: `clear()`, `items()`, `keys()`, `values()`, `pop()`, used much like an ordinary Python dict.
```python
class ModuleDict(nn.Module):
    def __init__(self):
        super(ModuleDict, self).__init__()
        self.choices = nn.ModuleDict({
            'conv': nn.Conv2d(10, 10, 3),
            'pool': nn.MaxPool2d(3)
        })
        self.activations = nn.ModuleDict({
            'relu': nn.ReLU(),
            'prelu': nn.PReLU()
        })

    def forward(self, x, choice, act):
        x = self.choices[choice](x)
        x = self.activations[act](x)
        return x
```
Each `ModuleDict` acts like a multiplexer: the path has to be specified along with the input:
```python
net = ModuleDict()
fake_img = torch.randn((4, 10, 32, 32))
output = net(fake_img, 'conv', 'relu')
```
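nn.ModuleDict also supports the dict-style methods listed above (a small illustrative sketch):

```python
choices = nn.ModuleDict({'conv': nn.Conv2d(10, 10, 3), 'pool': nn.MaxPool2d(3)})
print(list(choices.keys()))    # ['conv', 'pool']
choices['relu'] = nn.ReLU()    # insert by key
removed = choices.pop('pool')  # remove and return an entry
```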
AlexNet: won the 2012 ImageNet classification task with an accuracy more than 10 percentage points above the runner-up, opening a new era for convolutional neural networks.

AlexNet's characteristics are as follows:
Construction: uses Sequential containers and their built-in forward() method:
```python
class AlexNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(AlexNet, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(64, 192, kernel_size=5, padding=2),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(192, 384, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(384, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(256, 256, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        '''
        Named variant (wrap the layers in an OrderedDict):
        self.features = nn.Sequential(OrderedDict({
            'conv1': nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2),
            'relu1': nn.ReLU(inplace=True),
            'pool1': nn.MaxPool2d(kernel_size=3, stride=2),
            'conv2': nn.Conv2d(64, 192, kernel_size=5, padding=2),
            'relu2': nn.ReLU(inplace=True),
            'pool2': nn.MaxPool2d(kernel_size=3, stride=2),
            'conv3': nn.Conv2d(192, 384, kernel_size=3, padding=1),
            'relu3': nn.ReLU(inplace=True),
            'conv4': nn.Conv2d(384, 256, kernel_size=3, padding=1),
            'relu4': nn.ReLU(inplace=True),
            'conv5': nn.Conv2d(256, 256, kernel_size=3, padding=1),
            'relu5': nn.ReLU(inplace=True),
            'pool5': nn.MaxPool2d(kernel_size=3, stride=2),
        }))
        '''
        self.avgpool = nn.AdaptiveAvgPool2d((6, 6))
        self.classifier = nn.Sequential(
            nn.Dropout(),
            nn.Linear(256 * 6 * 6, 4096),
            nn.ReLU(inplace=True),
            nn.Dropout(),
            nn.Linear(4096, 4096),
            nn.ReLU(inplace=True),
            nn.Linear(4096, num_classes),
        )

    def forward(self, x):
        x = self.features(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.classifier(x)
        return x
```
torchvision/models also provides implementations of other classic networks such as googlenet and resnet.
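For example (a minimal sketch; exact behavior depends on the torchvision version installed):

```python
import torchvision.models as models

alexnet = models.alexnet()      # same structure as the AlexNet class above, randomly initialized
resnet18 = models.resnet18()
googlenet = models.googlenet()
```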
Convolution operation: the kernel slides over the input signal (image), performing multiply-accumulate at each position.

Convolution kernel: also called a filter; it can be thought of as a pattern or feature. The convolution process is like searching the image with a template: the more similar a region is to the kernel's pattern, the higher the activation, which is how features (edges, stripes, colors and other local patterns) are extracted.
Convolution dimensionality: in general, a convolution is N-dimensional when its kernel slides along N spatial dimensions.
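A quick sketch of the three cases (shapes only; the numbers are arbitrary):

```python
x1 = torch.randn(1, 3, 100)            # (B, C, L):       1D signal
x2 = torch.randn(1, 3, 100, 100)       # (B, C, H, W):    2D image
x3 = torch.randn(1, 3, 16, 100, 100)   # (B, C, D, H, W): 3D volume / video

print(nn.Conv1d(3, 6, 3)(x1).shape)    # torch.Size([1, 6, 98])
print(nn.Conv2d(3, 6, 3)(x2).shape)    # torch.Size([1, 6, 98, 98])
print(nn.Conv3d(3, 6, 3)(x3).shape)    # torch.Size([1, 6, 14, 98, 98])
```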
```python
nn.Conv2d(in_channels,      # number of input channels
          out_channels,     # number of output channels, i.e. number of kernels
          kernel_size,      # kernel size
          stride=1,         # stride
          padding=0,        # amount of padding
          dilation=1,       # dilation (for dilated/atrous convolution)
          groups=1,         # grouped convolution
          bias=True,        # bias
          padding_mode='zeros')
```
Function: performs a 2D convolution on multiple 2D signals (channels).

Main parameters: as annotated in the signature above (in/out channels, kernel size, stride, padding, dilation, groups, bias).

Output size calculation (per the PyTorch documentation): \(H_{out} = \left\lfloor \frac{H_{in} + 2 \times padding - dilation \times (kernel\_size - 1) - 1}{stride} \right\rfloor + 1\); with the defaults (no padding, no dilation, stride 1) this reduces to \(H_{out} = H_{in} - kernel\_size + 1\).
```python
set_seed(3)  # set random seed

# =================== load img ===================
path_img = os.path.join("lena.png")
img = Image.open(path_img).convert('RGB')  # 0~255

# convert to tensor
img_transform = transforms.Compose([transforms.ToTensor()])
img_tensor = img_transform(img)
img_tensor.unsqueeze_(dim=0)  # C*H*W to B*C*H*W

conv_layer = nn.Conv2d(3, 1, 3)  # input:(i, o, size)  weights:(o, i, h, w)
nn.init.xavier_normal_(conv_layer.weight.data)

# calculation
img_conv = conv_layer(img_tensor)
```
Different kernels produce different results.

Meanwhile, the tensor size changes during convolution:

Size before convolution: torch.Size([1, 3, 512, 512]); size after convolution: torch.Size([1, 1, 510, 510])
The Parameter held by this Conv2d is a 4-dimensional tensor on which the 2D convolution is performed. Its size is [1, 3, 3, 3], i.e. 1 output channel (number of kernels), 3 input channels, and a 3×3 kernel.
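This can be verified directly on the `conv_layer` created above (a small sketch):

```python
print(conv_layer.weight.shape)  # torch.Size([1, 3, 3, 3]) -> (out_channels, in_channels, kH, kW)
print(conv_layer.bias.shape)    # torch.Size([1])           -> one bias per output channel
```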
The convolution process is as follows:
```python
nn.ConvTranspose2d(in_channels,
                   out_channels,
                   kernel_size,
                   stride=1,
                   padding=0,
                   output_padding=0,
                   groups=1,
                   bias=True,
                   dilation=1,
                   padding_mode='zeros')
```
Transposed convolution, also called fractionally-strided convolution, is used to upsample images.

Note: although the matrix corresponding to a transposed-convolution kernel has the transposed shape of the matrix corresponding to an ordinary convolution kernel, the values are completely unrelated, so transposed convolution does not invert a convolution.
```python
conv_layer = nn.ConvTranspose2d(3, 1, 3, stride=2)  # input:(i, o, size)
# size before: torch.Size([1, 3, 512, 512])
# size after:  torch.Size([1, 1, 1025, 1025])
```
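The output size follows the formula from the PyTorch documentation: \(H_{out} = (H_{in} - 1) \times stride - 2 \times padding + dilation \times (kernel\_size - 1) + output\_padding + 1\). For the example above: \((512 - 1) \times 2 - 0 + 1 \times (3 - 1) + 0 + 1 = 1025\), matching the printed size.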
The output image becomes larger and contains many gaps; this is known as the [checkerboard effect](https://www.jianshu.com/p/36ff39344de5) of transposed convolution.
Max pooling / average pooling
```python
nn.MaxPool2d(kernel_size,
             stride=None,
             padding=0,
             dilation=1,            # spacing between pooling-window elements
             return_indices=False,  # record the indices of the pooled pixels
             ceil_mode=False)       # round the output size up instead of down

nn.AvgPool2d(kernel_size,
             stride=None,
             padding=0,
             ceil_mode=False,
             count_include_pad=True,   # include padded values in the average
             divisor_override=None)    # override the divisor (no longer the window size)
```
Pooling operation: "collects" (many-to-fewer) and "summarizes" the signal, much like a pond collecting water, hence the name pooling layer.
Unpooling
```python
nn.MaxUnpool2d(kernel_size,
               stride=None,
               padding=0)

forward(self, input, indices, output_size=None)
```
Function: upsamples a 2D signal (image) by max-unpooling.
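A minimal sketch (illustrative input sizes) of pooling with recorded indices followed by unpooling:

```python
img = torch.randn(1, 1, 4, 4)

pool = nn.MaxPool2d(2, stride=2, return_indices=True)  # indices are needed for unpooling
unpool = nn.MaxUnpool2d(2, stride=2)

pooled, indices = pool(img)          # torch.Size([1, 1, 2, 2])
restored = unpool(pooled, indices)   # torch.Size([1, 1, 4, 4]); max values restored, zeros elsewhere
```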
```python
nn.Linear(in_features,   # number of input nodes
          out_features,  # number of output nodes
          bias=True)
```
A linear layer, also called a fully-connected layer, connects each of its neurons to all neurons of the previous layer, computing a linear combination (linear transformation) of the previous layer's outputs.
Input = [1, 2, 3], shape (1, 3)

\(W_0 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \\ 1 & 2 & 3 & 4 \end{bmatrix}\), shape (3, 4)

Hidden = Input × \(W_0\) = [6, 12, 18, 24], shape (1, 4)
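The same computation with nn.Linear (a small sketch; note that nn.Linear stores its weight as (out_features, in_features), i.e. the transpose of \(W_0\)):

```python
fc = nn.Linear(3, 4, bias=False)
fc.weight.data = torch.tensor([[1., 1., 1.],
                               [2., 2., 2.],
                               [3., 3., 3.],
                               [4., 4., 4.]])  # = W_0 transposed
x = torch.tensor([[1., 2., 3.]])
print(fc(x))  # tensor([[ 6., 12., 18., 24.]], grad_fn=...)
```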
Activation functions apply a non-linear transformation to the features, which is what gives a multi-layer neural network its depth (a stack of purely linear layers would collapse into a single linear map).
nn.Sigmoid
Formula: \(y = \frac{1}{1+e^{-x}}\)
Gradient: \(y' = y(1 - y)\)
Properties:
nn.Tanh
Formula: \(y = \frac{\sinh x}{\cosh x} = \frac{e^x - e^{-x}}{e^x + e^{-x}} = \frac{2}{1+e^{-2x}} - 1\)
Gradient: \(y' = 1 - y^2\)
Properties:
nn.ReLU
Formula: \(y = \max(0, x)\)

Gradient: \(y' = \begin{cases} 1, & x > 0 \\ \text{undefined}, & x = 0 \\ 0, & x < 0 \end{cases}\)
Properties:
ReLU variants:

- nn.LeakyReLU: uses a small fixed slope (negative_slope) on the negative half-axis
- nn.PReLU: the negative-axis slope is a learnable parameter
- nn.RReLU: the negative-axis slope is drawn randomly from a range during training
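A small sketch comparing these activations on the same input (illustrative values):

```python
x = torch.linspace(-3, 3, 7)

print(nn.Sigmoid()(x))       # squashed into (0, 1)
print(nn.Tanh()(x))          # squashed into (-1, 1), zero-centered
print(nn.ReLU()(x))          # negative values clipped to 0
print(nn.LeakyReLU(0.1)(x))  # negative values scaled by the slope 0.1
print(nn.PReLU()(x))         # like LeakyReLU, but the slope is learnable
```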