1. Overview
SIFT Flow is an annotated semantic segmentation dataset with two kinds of labels: semantic classes (33 categories) and geometric/scene labels (3 categories).
Semantic and geometric segmentation classes for scenes.
Semantic: 0 is void and 1–33 are classes:
01 awning, 02 balcony, 03 bird, 04 boat, 05 bridge, 06 building, 07 bus, 08 car, 09 cow, 10 crosswalk, 11 desert, 12 door, 13 fence, 14 field, 15 grass, 16 moon, 17 mountain, 18 person, 19 plant, 20 pole, 21 river, 22 road, 23 rock, 24 sand, 25 sea, 26 sidewalk, 27 sign, 28 sky, 29 staircase, 30 streetlight, 31 sun, 32 tree, 33 window.
Geometric: -1 is void and 1–3 are classes:
01 sky, 02 horizontal, 03 vertical.
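For convenience when decoding the network's argmax outputs later (see infer.py below), the class lists above can be written down as Python lookup tables. This is just a transcription of the lists, not part of the original FCN code:

# Class-index-to-name tables, transcribed from the lists above.
# Semantic index 0 and geometric index -1 are void.
SEMANTIC_CLASSES = {
    1: 'awning', 2: 'balcony', 3: 'bird', 4: 'boat', 5: 'bridge',
    6: 'building', 7: 'bus', 8: 'car', 9: 'cow', 10: 'crosswalk',
    11: 'desert', 12: 'door', 13: 'fence', 14: 'field', 15: 'grass',
    16: 'moon', 17: 'mountain', 18: 'person', 19: 'plant', 20: 'pole',
    21: 'river', 22: 'road', 23: 'rock', 24: 'sand', 25: 'sea',
    26: 'sidewalk', 27: 'sign', 28: 'sky', 29: 'staircase',
    30: 'streetlight', 31: 'sun', 32: 'tree', 33: 'window',
}

GEOMETRIC_CLASSES = {1: 'sky', 2: 'horizontal', 3: 'vertical'}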
2. Model training
Step 1: Download the source code
git clone git@github.com:shelhamer/fcn.berkeleyvision.org.git
Step 2: Prepare the data
Download the annotated dataset SiftFlowDataset.zip from http://www.cs.unc.edu/~jtighe/Papers/ECCV10/siftflow/SiftFlowDataset.zip
Extract the archive into the data/sift-flow folder.
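If you would rather script this step than download and unzip by hand, here is a minimal sketch using only the Python 3 standard library; it assumes the URL above is reachable and that you run it from the repository root:

import os
import urllib.request
import zipfile

# Optional helper: fetch and unpack SiftFlowDataset.zip into data/sift-flow.
URL = 'http://www.cs.unc.edu/~jtighe/Papers/ECCV10/siftflow/SiftFlowDataset.zip'
DEST_DIR = 'data/sift-flow'

os.makedirs(DEST_DIR, exist_ok=True)
zip_path = os.path.join(DEST_DIR, 'SiftFlowDataset.zip')

# Download the archive.
urllib.request.urlretrieve(URL, zip_path)

# Unpack it where the FCN data layer expects to find it.
with zipfile.ZipFile(zip_path) as zf:
    zf.extractall(DEST_DIR)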
Step 3: Modify the code
git clone git@github.com:litingpan/fcn.git
Alternatively, download it from https://github.com/litingpan/fcn and replace the entire siftflow-fcn32s folder with it.
In it, solve.py is modified as follows:
import caffe
import surgery, score

import numpy as np
import os
import sys

try:
    import setproctitle
    setproctitle.setproctitle(os.path.basename(os.getcwd()))
except:
    pass

# weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'
vgg_weights = '../ilsvrc-nets/VGG_ILSVRC_16_layers.caffemodel'
vgg_proto = '../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt'

# init
# caffe.set_device(int(sys.argv[1]))
caffe.set_device(0)
caffe.set_mode_gpu()

# solver = caffe.SGDSolver('solver.prototxt')
# solver.net.copy_from(weights)
solver = caffe.SGDSolver('solver.prototxt')
vgg_net = caffe.Net(vgg_proto, vgg_weights, caffe.TRAIN)
surgery.transplant(solver.net, vgg_net)
del vgg_net

# surgeries
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# scoring
test = np.loadtxt('../data/sift-flow/test.txt', dtype=str)

for _ in range(50):
    solver.step(2000)
    # N.B. metrics on the semantic labels are off b.c. of missing classes;
    # score manually from the histogram instead for proper evaluation
    score.seg_tests(solver, False, test, layer='score_sem', gt='sem')
    score.seg_tests(solver, False, test, layer='score_geo', gt='geo')
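For reference (this describes the FCN repository's surgery.py, not a change made here): surgery.transplant copies the pretrained VGG-16 weights into the FCN net layer by layer, coercing shapes where the two architectures differ, and surgery.interp initializes the 'up*' deconvolution layers to bilinear interpolation so upsampling starts from a sensible kernel. The loop runs 50 × 2000 = 100,000 SGD iterations in total, evaluating both the semantic (score_sem) and geometric (score_geo) heads on the test split every 2000 steps.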
Step 4: Download the pretrained model
ILSVRC-2014 model (VGG team) with 16 weight layers: https://gist.github.com/ksimonyan/211839e770f7b538e2d8/revisions
Download both VGG_ILSVRC_16_layers.caffemodel and VGG_ILSVRC_16_layers_deploy.prototxt and place them in the ilsvrc-nets directory.
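As an optional sanity check (my addition, not part of the original walkthrough), you can verify that the two files load together in pycaffe before starting the long training run; the paths assume you run this from the siftflow-fcn32s directory:

import caffe

# Load the VGG-16 deploy definition together with its weights on the CPU.
caffe.set_mode_cpu()
vgg_net = caffe.Net('../ilsvrc-nets/VGG_ILSVRC_16_layers_deploy.prototxt',
                    '../ilsvrc-nets/VGG_ILSVRC_16_layers.caffemodel',
                    caffe.TEST)

# Print the learnable layers; you should see conv1_1 through fc8.
print(sorted(vgg_net.params.keys()))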
Step 5: Train
python solve.py
When training finishes, train_iter_100000.caffemodel in the snapshot directory is the trained model.
3. Prediction
Step 1: Prepare the model
You can use the model we trained above; if you do not want to train one yourself, you can directly download a trained model from http://dl.caffe.berkeleyvision.org/siftflow-fcn32s-heavy.caffemodel
Step 2: deploy.prototxt
It is adapted from test.prototxt, with three main changes:
(1) The input layer
layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    # These dimensions are purely for sake of example;
    # see infer.py for how to reshape the net to the given input size.
    shape { dim: 1 dim: 3 dim: 256 dim: 256 }
  }
}
Note that the shape in the Input layer should match the size of the image to be tested (here 256 × 256, the size of all SIFT Flow images; infer.py additionally reshapes the data blob to the actual input shape before the forward pass).
(2) The dropout layers were removed.
(3) The loss layer and the layers related to it were removed. (A quick way to confirm that the edited deploy.prototxt still parses is sketched below.)
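The following small check is my addition, not part of the original post: it loads deploy.prototxt without weights and prints the blob shapes, which is a cheap way to catch prototxt mistakes after the three edits above.

import caffe

# Parse deploy.prototxt without weights; this fails loudly if the prototxt is malformed.
caffe.set_mode_cpu()
net = caffe.Net('deploy.prototxt', caffe.TEST)

# Inspect blob shapes; 'data' should be 1 x 3 x 256 x 256 and the two score
# heads (score_sem, score_geo) should be present.
for name, blob in net.blobs.items():
    print(name, blob.data.shape)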
Step 3: infer.py
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import sys

import caffe

# load image, switch to BGR, subtract mean, and make dims C x H x W for Caffe
im = Image.open('coast_bea14.jpg')
in_ = np.array(im, dtype=np.float32)
in_ = in_[:,:,::-1]
in_ -= np.array((104.00698793,116.66876762,122.67891434))
in_ = in_.transpose((2,0,1))

# load net
net = caffe.Net('deploy.prototxt', 'snapshot/train_iter_100000.caffemodel', caffe.TEST)
# shape for input (data blob is N x C x H x W), set data
net.blobs['data'].reshape(1, *in_.shape)
net.blobs['data'].data[...] = in_
# run net and take argmax for prediction
net.forward()

# semantic segmentation head
sem_out = net.blobs['score_sem'].data[0].argmax(axis=0)
plt.imshow(sem_out)
plt.axis('off')
plt.savefig('coast_bea14_sem_out.png')
sem_out_img = Image.fromarray(sem_out.astype('uint8')).convert('RGB')
sem_out_img.save('coast_bea14_sem_img_out.png')

# geometric (scene) segmentation head
geo_out = net.blobs['score_geo'].data[0].argmax(axis=0)
plt.imshow(geo_out)
plt.axis('off')
plt.savefig('coast_bea14_geo_out.png')
geo_out_img = Image.fromarray(geo_out.astype('uint8')).convert('RGB')
geo_out_img.save('coast_bea14_geo_img_out.png')
Here, sem_out_img stores the semantic segmentation result and geo_out_img stores the geometric (scene) labeling result.
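Because the images saved via Image.fromarray(...).convert('RGB') contain raw class indices (small values), they appear nearly black. If you want a more readable visualization, the following optional sketch (my addition, using an arbitrary fixed random palette) colorizes a label map with PIL:

import numpy as np
from PIL import Image

def colorize_labels(label_map, num_classes=34, seed=0):
    """Map integer class indices to a fixed random RGB palette (index 0 stays black)."""
    rng = np.random.RandomState(seed)
    palette = rng.randint(0, 256, size=(num_classes, 3), dtype=np.uint8)
    palette[0] = 0  # keep the void class black
    return Image.fromarray(palette[label_map])

# Example: colorize the semantic prediction computed in infer.py.
# colorize_labels(sem_out).save('coast_bea14_sem_color.png')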
Step 4: Test
python infer.py
The images in SIFT Flow are all 256 × 256 × 3 color images.
Inside the extracted dataset, the images folder holds the image data, semanticlabels holds the semantic segmentation labels (33 classes, plus an extra void class in the annotations), and geolabels holds the scene/geometric labels (3 classes, plus an extra void class in the annotations).
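To peek at a single label file, something like the sketch below should work. It assumes the labels are stored as .mat files whose array sits under the 'S' key, which is how the FCN repo's siftflow_layers.py reads them; treat the exact subdirectory layout as an assumption and adjust the path to match your extracted archive:

import numpy as np
import scipy.io

# Assumed path layout; point this at one of the .mat files in your extracted label folder.
mat_path = 'data/sift-flow/SemanticLabels/spatial_envelope_256x256_static_8outdoorcategories/coast_bea14.mat'

label = scipy.io.loadmat(mat_path)['S']   # 'S' is the key siftflow_layers.py uses
print(label.shape)                        # expected: (256, 256)
print(np.unique(label))                   # class indices present in this image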
So, in effect, two classifiers are trained that share the same front layers: in this implementation they live in a single network with two output heads, score_sem and score_geo (see solve.py above).
Here coast_bea14_sem_out.png is the semantic segmentation result and coast_bea14_geo_out.png is the geometric (scene) labeling result.
[Figures: original image, semantic segmentation result, geometric (scene) labeling result]