Summary on deep learning framework --- Torch7

時間 2019-11-05

標籤 summary deep learning framework torch7 torch 简体版

原文原文鏈接

Summary on deep learning framework --- Torch7 html

2018-07-22 21:30:28linux

1. 嘗試第一個 CNN 的 torch版本, 代碼以下:git

 1 -- We now have 5 steps left to do in training our first torch neural network
 2 -- 1. Load and normalize data
 3 -- 2. Define Neural Network
 4 -- 3. Define Loss function
 5 -- 4. Train network on training data
 6 -- 5. Test network on test data.
 7 
 8 
 9 
 10 
 11 -- 1. Load and normalize data
 12 require 'paths'
 13 require 'image';  14 if (not paths.filep("cifar10torchsmall.zip")) then
 15     os.execute('wget -c https://s3.amazonaws.com/torch7/data/cifar10torchsmall.zip')  16     os.execute('unzip cifar10torchsmall.zip')  17 end
 18 trainset = torch.load('cifar10-train.t7')  19 testset = torch.load('cifar10-test.t7')  20 classes = {'airplane', 'automobile', 'bird', 'cat',  21            'deer', 'dog', 'frog', 'horse', 'ship', 'truck'}  22 
 23 print(trainset)  24 print(#trainset.data)  25 
 26 itorch.image(trainset.data[100]) -- display the 100-th image in dataset
 27 print(classes[trainset.label[100]])  28 
 29 -- ignore setmetatable for now, it is a feature beyond the scope of this tutorial.
 30 -- It sets the index operator 
 31 setmetatable(trainset,  32     {__index = function(t, i)  33                     return {t.data[i], t.label[i]}  34                 end}  35 );  36 trainset.data = trainset.data:double()  -- convert the data from a ByteTensor to a DoubleTensor.
 37 
 38 function trainset:size()  39     return self.data:size(1)  40 end
 41 
 42 print(trainset:size())  43 print(trainset[33])  44 itorch.image(trainset[33][11])  45 
 46 redChannel = trainset.data[{ {}, {1}, {}, {} }] -- this pick {all images, 1st channel, all vertical pixels, all horizontal pixels}
 47 print(#redChannel)  48 
 49 -- TODO:fill
 50 mean = {}  51 stdv = {}  52 for i = 1,3 do 
 53     mean[i] = trainset.data[{ {}, {i}, {}, {} }]:mean()  -- mean estimation 
 54     print('Channel ' .. i .. ' , Mean: ' .. mean[i])  55     trainset.data[{ {}, {i}, {}, {} }]:add(-mean[i]) -- mean subtraction 
 56 
 57     stdv[i] = trainset.data[ { {}, {i}, {}, {} }]:std()  -- std estimation 
 58     print('Channel ' .. i .. ' , Standard Deviation: ' .. stdv[i])  59     trainset.data[{ {}, {i}, {}, {} }]:div(stdv[i])  -- std scaling 
 60 end 
 61 
 62 
 63 
 64 -- 2. Define Neural Network
 65 net = nn.Sequential()  66 net:add(nn.SpatialConvolution(3, 6, 5, 5)) -- 3 input image channels, 6 output channels, 5x5 convolution kernel
 67 net:add(nn.ReLU())                       -- non-linearity 
 68 net:add(nn.SpatialMaxPooling(2,2,2,2))     -- A max-pooling operation that looks at 2x2 windows and finds the max.
 69 net:add(nn.SpatialConvolution(6, 16, 5, 5))  70 net:add(nn.ReLU())                       -- non-linearity 
 71 net:add(nn.SpatialMaxPooling(2,2,2,2))  72 net:add(nn.View(16*5*5))                    -- reshapes from a 3D tensor of 16x5x5 into 1D tensor of 16*5*5
 73 net:add(nn.Linear(16*5*5, 120))             -- fully connected layer (matrix multiplication between input and weights)
 74 net:add(nn.ReLU())                       -- non-linearity 
 75 net:add(nn.Linear(120, 84))  76 net:add(nn.ReLU())                       -- non-linearity 
 77 net:add(nn.Linear(84, 10))                   -- 10 is the number of outputs of the network (in this case, 10 digits)
 78 net:add(nn.LogSoftMax())                     -- converts the output to a log-probability. Useful for classification problems
 79 
 80 
 81 -- 3. Let us difine the Loss function 
 82 criterion = nn.ClassNLLCriterion()  83 
 84 
 85 
 86 -- 4. Train the neural network 
 87 trainer = nn.StochasticGradient(net, criterion)  88 trainer.learningRate = 0.001
 89 trainer.maxIteration = 5 -- just do 5 epochs of training. 
 90 trainer:train(trainset)  91 
 92 
 93 
 94 -- 5. Test the network, print accuracy
 95 print(classes[testset.label[100]])  96 itorch.image(testset.data[100])  97 
 98 testset.data = testset.data:double()   -- convert from Byte tensor to Double tensor
 99 for i=1,3 do -- over each image channel
100     testset.data[{ {}, {i}, {}, {}  }]:add(-mean[i]) -- mean subtraction 
101     testset.data[{ {}, {i}, {}, {}  }]:div(stdv[i]) -- std scaling
102 end
103 
104 -- for fun, print the mean and standard-deviation of example-100
105 horse = testset.data[100] 106 print(horse:mean(), horse:std()) 107 
108 print(classes[testset.label[100]]) 109 itorch.image(testset.data[100]) 110 predicted = net:forward(testset.data[100]) 111  
112 -- the output of the network is Log-Probabilities. To convert them to probabilities, you have to take e^x 
113 print(predicted:exp()) 114 
115 
116 for i=1,predicted:size(1) do
117     print(classes[i], predicted[i]) 118 end
119 
120 
121 -- test the accuracy 
122 correct = 0
123 for i=1,10000 do
124     local groundtruth = testset.label[i] 125     local prediction = net:forward(testset.data[i]) 126     local confidences, indices = torch.sort(prediction, true)  -- true means sort in descending order
127     if groundtruth == indices[1] then
128         correct = correct + 1
129     end
130 end
131 
132 
133 print(correct, 100*correct/10000 .. ' % ') 134 
135 class_performance = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0} 136 for i=1,10000 do
137     local groundtruth = testset.label[i] 138     local prediction = net:forward(testset.data[i]) 139     local confidences, indices = torch.sort(prediction, true)  -- true means sort in descending order
140     if groundtruth == indices[1] then
141         class_performance[groundtruth] = class_performance[groundtruth] + 1
142     end
143 end
144 
145 
146 for i=1,#classes do
147     print(classes[i], 100*class_performance[i]/1000 .. ' %') 148 end
149 
150 require 'cunn'; 151 net = net:cuda() 152 criterion = criterion:cuda() 153 trainset.data = trainset.data:cuda() 154 trainset.label = trainset.label:cuda() 155 
156 trainer = nn.StochasticGradient(net, criterion) 157 trainer.learningRate = 0.001
158 trainer.maxIteration = 5 -- just do 5 epochs of training.
159 
160 
161 trainer:train(trainset)

View Code

　　那麼,運行起來卻出現以下的問題:
github

　　(1).windows

/home/wangxiao/torch/install/bin/luajit: ./train_network.lua:26: attempt to index global 'itorch' (a nil value)
stack traceback:
./train_network.lua:26: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
wangxiao@AHU:~/Documents/Lua test examples$ 緩存

　　主要是 itorch 的問題, 另外就是要引用 require 'nn' 來解決沒法辨別 nn 的問題.bash

　　我是把帶有 itorch 的那些行都暫時註釋了.less

2. 'libcudnn (R5) not found in library path.ide

wangxiao@AHU:~/Downloads/wide-residual-networks-master$ th ./train_Single_Multilabel_Image_Classification.lua
nil
/home/wangxiao/torch/install/bin/luajit: /home/wangxiao/torch/install/share/lua/5.1/trepl/init.lua:384: /home/wangxiao/torch/install/share/lua/5.1/trepl/init.lua:384: /home/wangxiao/torch/install/share/lua/5.1/cudnn/ffi.lua:1600: 'libcudnn (R5) not found in library path.
Please install CuDNN from https://developer.nvidia.com/cuDNN
Then make sure files named as libcudnn.so.5 or libcudnn.5.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)oop

stack traceback:
[C]: in function 'error'
/home/wangxiao/torch/install/share/lua/5.1/trepl/init.lua:384: in function 'require'
./train_Single_Multilabel_Image_Classification.lua:8: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: at 0x00406670
wangxiao@AHU:~/Downloads/wide-residual-networks-master$

================================================================>>

答案是:

　　從新下載了 cudnn-7.5-linux-x64-v5.0-ga.tgz

　　而且從新配置了,可是依然提醒這個問題,那麼,問題何在呢?查看了博客:http://blog.csdn.net/hungryof/article/details/51557666 中的內容:

坑4 可能出現’libcudnn not found in library path’的狀況

截取其中一段錯誤信息：

Please install CuDNN from https://developer.nvidia.com/cuDNN Then make sure files named as libcudnn.so.5 or libcudnn.5.dylib are placed in your library load path (for example /usr/local/lib , or manually add a path to LD_LIBRARY_PATH)

LD_LIBRARY_PATH是該環境變量，主要用於指定查找共享庫（動態連接庫）時除了默認路徑以外的其餘路徑。因爲剛纔已經將
「libcudnn*」複製到了/usr/local/cuda-7.5/lib64/下面，所以須要

sudo gedit /etc/ld.so.conf.d/cudnn.conf 就是新建一個conf文件。名字隨便
加入剛纔的路徑/usr/local/cuda-7.5/lib64/
反正我還添加了/usr/local/cuda-7.5/include/,這個估計不要也行。
保存後，再sudo ldconfig來更新緩存。（可能會出現libcudnn.so.5不是符號鏈接的問題，不過無所謂了！！）

此時運行

th neural_style.lua -gpu 0 -backend cudnn

成功了！！！！

============================================================>>>>

評價:　　按照這種作法試了,確實成功了! 贊一個 !!!

　　3. 利用 gm 加載圖像時,提示錯誤,可是裝上那個包仍然提示錯誤:

Load library:

gm = require 'graphicsmagick'

First, we provide two high-level functions to load/save directly into/form tensors:

img = gm.load('/path/to/image.png' [, type]) -- type = 'float' (default) | 'double' | 'byte' gm.save('/path/to/image.jpg' [,quality]) -- quality = 0 to 100 (for jpegs only)

The following provide a more controlled flow for loading/saving jpegs.

Create an image, from a file:

image = gm.Image('/path/to/image.png') -- or image = gm.Image() image:load('/path/to/image.png')

　　可是悲劇的仍然有錯, 只好換了用 image.load() 的方式加載圖像:

--To load as byte tensor for rgb imagefile
local img = image.load(imagefile,3,'byte')

 
　　4. Torch 保存 txt 文件：
　　-- save opt 
　　file = torch.DiskFile(paths.concat(opt.checkpoints_dir, opt.name, 'opt.txt'), 'w') 
　　file:writeObject(opt) 
　　file:close() 
　　
　　5. Torch 建立新的文件夾
　　opts.modelPath = opt.modelDir .. opt.modelName 
　　if not paths.dirp(opt.modelPath) then 
　　　　paths.mkdir(opts.modelPath) 
　　end 

　　6. Torch Lua 保存 圖像到文件夾 
　　藉助 image package，首先安裝： luarocks install image 
　　而後 require 'image' 
　　就可使用了： local img = image.save('./saved_pos_neg_image/candidate_' .. tostring(i) .. tostring(j) .. '.png', pos_patch, 1, 32, 32) 

　　7. module 'bit' not found:No LuaRocks module found for bit

wangxiao@AHU:/media/wangxiao/724eaeef-e688-4b09-9cc9-dfaca44079b2/fast-neural-style-master$ th ./train.lua
/home/wangxiao/torch/install/bin/lua: /home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: /home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: /home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: module 'bit' not found:No LuaRocks module found for bit
no field package.preload['bit']
no file '/home/wangxiao/.luarocks/share/lua/5.2/bit.lua'
no file '/home/wangxiao/.luarocks/share/lua/5.2/bit/init.lua'
no file '/home/wangxiao/torch/install/share/lua/5.2/bit.lua'
no file '/home/wangxiao/torch/install/share/lua/5.2/bit/init.lua'
no file '/home/wangxiao/.luarocks/share/lua/5.1/bit.lua'
no file '/home/wangxiao/.luarocks/share/lua/5.1/bit/init.lua'
no file '/home/wangxiao/torch/install/share/lua/5.1/bit.lua'
no file '/home/wangxiao/torch/install/share/lua/5.1/bit/init.lua'
no file './bit.lua'
no file '/home/wangxiao/torch/install/share/luajit-2.1.0-beta1/bit.lua'
no file '/usr/local/share/lua/5.1/bit.lua'
no file '/usr/local/share/lua/5.1/bit/init.lua'
no file '/home/wangxiao/.luarocks/lib/lua/5.2/bit.so'
no file '/home/wangxiao/torch/install/lib/lua/5.2/bit.so'
no file '/home/wangxiao/torch/install/lib/bit.so'
no file '/home/wangxiao/.luarocks/lib/lua/5.1/bit.so'
no file '/home/wangxiao/torch/install/lib/lua/5.1/bit.so'
no file './bit.so'
no file '/usr/local/lib/lua/5.1/bit.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: in function 'error'
/home/wangxiao/torch/install/share/lua/5.2/trepl/init.lua:389: in function 'require'
./train.lua:5: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: in ?
wangxiao@AHU:/media/wangxiao/724eaeef-e688-4b09-9cc9-dfaca44079b2/fast-neural-style-master$


在終端中執行：luarocks install luabitop 
就能夠了。

8.　　HDF5Group:read() - no such child 'media' for [HDF5Group 33554432 /]

/home/wangxiao/torch/install/bin/lua: /home/wangxiao/torch/install/share/lua/5.2/hdf5/group.lua:312: HDF5Group:read() - no such child 'media' for [HDF5Group 33554432 /]
stack traceback:
[C]: in function 'error'
/home/wangxiao/torch/install/share/lua/5.2/hdf5/group.lua:312: in function </home/wangxiao/torch/install/share/lua/5.2/hdf5/group.lua:302>
(...tail calls...)
./fast_neural_style/DataLoader.lua:44: in function '__init'
/home/wangxiao/torch/install/share/lua/5.2/torch/init.lua:91: in function </home/wangxiao/torch/install/share/lua/5.2/torch/init.lua:87>
[C]: in function 'DataLoader'
./train.lua:138: in function 'main'
./train.lua:327: in main chunk
[C]: in function 'dofile'
...xiao/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:145: in main chunk
[C]: in ?

最近在訓練類型遷移的代碼，發現這個蛋疼的問題。哎。。糾結好幾天了。。這個 hdf5 到底怎麼回事？求解釋！！！

------------------------------------------------------------------------------------------------

　　後來發現, 是我本身的數據集路徑設置的有問題, 如: 應該是 CoCo/train/image/

　　可是,我只是給定了 CoCo/train/ ...

　　9. 怎麼設置 torch代碼在哪塊 GPU 上運行 ? 或者怎麼設置在兩塊卡上同時運行 ?

　　如圖所示: export CUDA_VISIBLE_DEVICES=0 便可指定代碼在 GPU-0 上運行.

　　10. When load the pre-trained VGG model, got the following errors:

　　　　MODULE data UNDEFINED
　　　　warning: module 'data [type 5]' not found
　　　　nn supports no groups!
　　　　warning: module 'conv2 [type 4]' not found
　　　　nn supports no groups!
　　　　warning: module 'conv4 [type 4]' not found
　　　　nn supports no groups!
　　　　warning: module 'conv5 [type 4]' not found

 1 using cudnn  2 Successfully loaded ./feature_transfer/AlexNet_files/bvlc_alexnet.caffemodel  3 MODULE data UNDEFINED  4 warning: module 'data [type 5]' not found  5 nn supports no groups!
 6 warning: module 'conv2 [type 4]' not found  7 nn supports no groups!
 8 warning: module 'conv4 [type 4]' not found  9 nn supports no groups!
10 warning: module 'conv5 [type 4]' not found

 1 wangxiao@AHU:~/Downloads/multi-modal-visual-tracking$ qlua ./train_match_function_alexNet_version_2017_02_28.lua  2 using cudnn  3 Successfully loaded ./feature_transfer/AlexNet_files/bvlc_alexnet.caffemodel  4 MODULE data UNDEFINED  5 warning: module 'data [type 5]' not found  6 nn supports no groups!
 7 warning: module 'conv2 [type 4]' not found  8 nn supports no groups!
 9 warning: module 'conv4 [type 4]' not found  10 nn supports no groups!
 11 warning: module 'conv5 [type 4]' not found  12 conv1: 96 3 11 11
 13 conv3: 384 256 3 3
 14 fc6: 1 1 9216 4096
 15 fc7: 1 1 4096 4096
 16 fc8: 1 1 4096 1000
 17 nn.Sequential {  18 [input -> (1) -> (2) -> (3) -> output]  19 (1): nn.SplitTable  20 (2): nn.ParallelTable {  21 input  22 |`-> (1): nn.Sequential {  23 | [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> output]  24 | (1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4)  25 | (2): nn.ReLU  26 | (3): nn.SpatialCrossMapLRN  27 | (4): nn.SpatialMaxPooling(3x3, 2,2)  28 | (5): nn.ReLU  29 | (6): nn.SpatialCrossMapLRN  30 | (7): nn.SpatialMaxPooling(3x3, 2,2)  31 | (8): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)  32 | (9): nn.ReLU  33 | (10): nn.ReLU  34 | (11): nn.ReLU  35 | (12): nn.SpatialMaxPooling(3x3, 2,2)  36 | (13): nn.View(-1)  37 | (14): nn.Linear(9216 -> 4096)  38 | (15): nn.ReLU  39 | (16): nn.Dropout(0.500000)  40 | (17): nn.Linear(4096 -> 4096)  41 | (18): nn.ReLU  42 | }  43 `-> (2): nn.Sequential {  44 [input -> (1) -> (2) -> (3) -> (4) -> (5) -> (6) -> (7) -> (8) -> (9) -> (10) -> (11) -> (12) -> (13) -> (14) -> (15) -> (16) -> (17) -> (18) -> output]  45 (1): nn.SpatialConvolution(3 -> 96, 11x11, 4,4)  46 (2): nn.ReLU  47 (3): nn.SpatialCrossMapLRN  48 (4): nn.SpatialMaxPooling(3x3, 2,2)  49 (5): nn.ReLU  50 (6): nn.SpatialCrossMapLRN  51 (7): nn.SpatialMaxPooling(3x3, 2,2)  52 (8): nn.SpatialConvolution(256 -> 384, 3x3, 1,1, 1,1)  53 (9): nn.ReLU  54 (10): nn.ReLU  55 (11): nn.ReLU  56 (12): nn.SpatialMaxPooling(3x3, 2,2)  57 (13): nn.View(-1)  58 (14): nn.Linear(9216 -> 4096)  59 (15): nn.ReLU  60 (16): nn.Dropout(0.500000)  61 (17): nn.Linear(4096 -> 4096)  62 (18): nn.ReLU  63 }  64 ... -> output  65 }  66 (3): nn.PairwiseDistance  67 }  68 =================================================================================================================
 69 ================= AlextNet based Siamese Search for Visual Tracking ========================
 70 =================================================================================================================
 71 ==>> The Benchmark Contain: 36 videos ...  72 deal with video 1/36 video name: BlurFace ... please waiting ...  73 the num of gt bbox: 493
 74 the num of video frames: 493
 75 ========>>>> Begin to track 2 video name: nil-th frame, please waiting ...  76 ========>>>> Begin to track 3 video name: nil-th frame, please waiting ... ............] ETA: 0ms | Step: 0ms  77 ========>>>> Begin to track 4 video name: nil-th frame, please waiting ... ............] ETA: 39s424ms | Step: 80ms  78 ========>>>> Begin to track 5 video name: nil-th frame, please waiting ... ............] ETA: 33s746ms | Step: 69ms  79 ========>>>> Begin to track 6 video name: nil-th frame, please waiting ... ............] ETA: 31s817ms | Step: 65ms  80 ========>>>> Begin to track 7 video name: nil-th frame, please waiting ... ............] ETA: 32s575ms | Step: 66ms  81 ========>>>> Begin to track 8 video name: nil-th frame, please waiting ... ............] ETA: 34s376ms | Step: 70ms  82 ========>>>> Begin to track 9 video name: nil-th frame, please waiting ... ............] ETA: 40s240ms | Step: 82ms  83 ========>>>> Begin to track 10 video name: nil-th frame, please waiting ... ...........] ETA: 44s211ms | Step: 91ms  84 ========>>>> Begin to track 11 video name: nil-th frame, please waiting ... ...........] ETA: 45s993ms | Step: 95ms  85 ========>>>> Begin to track 12 video name: nil-th frame, please waiting ... ...........] ETA: 47s754ms | Step: 99ms  86 ========>>>> Begin to track 13 video name: nil-th frame, please waiting ... ...........] ETA: 50s392ms | Step: 104ms  87 ========>>>> Begin to track 14 video name: nil-th frame, please waiting ... ...........] ETA: 53s138ms | Step: 110ms  88 ========>>>> Begin to track 15 video name: nil-th frame, please waiting ... ...........] ETA: 55s793ms | Step: 116ms  89 ========>>>> Begin to track 16 video name: nil-th frame, please waiting ... ...........] ETA: 59s253ms | Step: 123ms  90 ========>>>> Begin to track 17 video name: nil-th frame, please waiting ... ...........] ETA: 1m2s | Step: 130ms  91 ========>>>> Begin to track 18 video name: nil-th frame, please waiting ... ...........] ETA: 1m5s | Step: 137ms  92 ========>>>> Begin to track 19 video name: nil-th frame, please waiting ... ...........] ETA: 1m8s | Step: 143ms  93 ========>>>> Begin to track 20 video name: nil-th frame, please waiting ... ...........] ETA: 1m11s | Step: 149ms  94 //////////////////////////////////////////////////////////////////////////..............] ETA: 1m14s | Step: 157ms 
 95 ==>> pos_proposal_list: 19
 96 ==>> neg_proposal_list: 19
 97 qlua: /home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:67:  98 In 2 module of nn.Sequential:  99 In 1 module of nn.ParallelTable: 100 In 8 module of nn.Sequential: 101 /home/wangxiao/torch/install/share/lua/5.1/nn/THNN.lua:117: Need input of dimension 3 and input.size[0] == 256 but got input to be of shape: [96 x 13 x 13] at /tmp/luarocks_cunn-scm-1-6210/cunn/lib/THCUNN/generic/SpatialConvolutionMM.cu:49
102 stack traceback: 103 [C]: in function 'v'
104 /home/wangxiao/torch/install/share/lua/5.1/nn/THNN.lua:117: in function 'SpatialConvolutionMM_updateOutput'
105 ...ao/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:79: in function <...ao/torch/install/share/lua/5.1/nn/SpatialConvolution.lua:76>
106 [C]: in function 'xpcall'
107 /home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
108 ...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function <...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:41>
109 [C]: in function 'xpcall'
110 /home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
111 ...angxiao/torch/install/share/lua/5.1/nn/ParallelTable.lua:12: in function <...angxiao/torch/install/share/lua/5.1/nn/ParallelTable.lua:10>
112 [C]: in function 'xpcall'
113 /home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
114 ...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
115 ./train_match_function_alexNet_version_2017_02_28.lua:525: in function 'opfunc'
116 /home/wangxiao/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'optim'
117 ./train_match_function_alexNet_version_2017_02_28.lua:550: in main chunk 118 
119 
120 
121 WARNING: If you see a stack trace below, it doesn't point to the place where this error occurred. Please use only the one above.
122 stack traceback: 123 [C]: at 0x7f86014df9c0
124 [C]: in function 'error'
125 /home/wangxiao/torch/install/share/lua/5.1/nn/Container.lua:67: in function 'rethrowErrors'
126 ...e/wangxiao/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
127 ./train_match_function_alexNet_version_2017_02_28.lua:525: in function 'opfunc'
128 /home/wangxiao/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'optim'
129 ./train_match_function_alexNet_version_2017_02_28.lua:550: in main chunk 130 wangxiao@AHU:~/Downloads/multi-modal-visual-tracking$

　　Just like the screen shot above, change the 'nn' into 'cudnn' will be ok and passed.

　　11. both (null) and torch.FloatTensor have no less-than operator

　　　　qlua: ./test_MM_tracker_VGG_.lua:254: both (null) and torch.FloatTensor have no less-than operator
　　　　stack traceback:
　　　　[C]: at 0x7f628816e9c0
　　　　[C]: in function '__lt'
　　　　./test_MM_tracker_VGG_.lua:254: in main chunk

　　Because it is floatTensor () style and you can change it like this if you want this value printed in a for loop:　predictValue -->> predictValue[i] .

　　12.

========>>>> Begin to track the 6-th and the video name is ILSVRC2015_train_00109004 , please waiting ...
THCudaCheck FAIL file=/tmp/luarocks_cutorch-scm-1-707/cutorch/lib/THC/generic/THCStorage.cu line=66 error=2 : out of memory
qlua: cuda runtime error (2) : out of memory at /tmp/luarocks_cutorch-scm-1-707/cutorch/lib/THC/generic/THCStorage.cu:66
stack traceback:
[C]: at 0x7fa20a8f99c0
[C]: at 0x7fa1dddfbee0
[C]: in function 'Tensor'
./train_match_function_VGG_version_2017_03_02.lua:377: in main chunk
wangxiao@AHU:~/Downloads/multi-modal-visual-tracking$

Yes, it is just out of memory of GPU. Just turn the batchsize to a small value, it may work. It worked for me. Ha ha ...

13. luarocks install class does not have any effect, it still shown me the error: No Module named "class" in Torch.

　　==>> in terminal, install this package in sudo.

　　==>> then, it will be OK.

14. How to install opencv 3.1 on Ubuntu 14.04 ???

　　As we can found from: http://blog.csdn.net/a125930123/article/details/52091140

　　1. first, you should install torch successfully ;

　　2. then, just follow what the blog said here:

安裝opencv3.1

一、安裝必要的包

sudo apt-get install build-essential
sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev

二、下載opencv3.1

http://opencv.org/downloads.html
解壓：unzip opencv-3.1.0

三、安裝
cd ~/opencv-3.1.0
mkdir build
cd build
cmake -D CMAKE_BUILD_TYPE=Release -D CMAKE_INSTALL_PREFIX=/usr/local ..

sudo make -j24 
sudo make install -j24  
sudo /bin/bash -c 'echo "/usr/local/lib" > /etc/ld.so.conf.d/opencv.conf'
sudo ldconfig

安裝完成

四、問題

的問題。在安裝過程當中可能會出現沒法下載ippicv_linux_20151201.tgz

解決方案：

：https://raw.githubusercontent.com/Itseez/opencv_3rdparty/81a676001ca8075ada498583e4166079e5744668/ippicv/ippicv_linux_20151201.tgz手動下載ippicv_linux_20151201.tgz

將下載好的文件 放入 中，若是已經存在 ，則替換掉，這樣就能夠安裝完成了。opencv-3.1.0/3rdparty/ippicv/downloads/linux-808b791a6eac9ed78d32a7666804320e

五、最後執行命令

luarocks install cv

OpenCV bindings for Torch安裝成功。

But, maybe you may found some errors, such as:

cudalegacy/src/graphcuts.cpp:120:54: error: ‘NppiGraphcutState’ has not been declared (solution draw from: http://blog.csdn.net/allyli0022/article/details/62859290)

At this moment, you need to change some files:

found graphcuts.cpp in opencv3.1, and do the following changes:

解決方案：須要修改一處源碼： 在graphcuts.cpp中將

#if !defined (HAVE_CUDA) || defined (CUDA_DISABLER)

改成

#if !defined (HAVE_CUDA) || defined (CUDA_DISABLER) || (CUDART_VERSION >= 8000) 

then, try again, it will be ok...this code just want to make opencv3.1 work under cuda 8.0, you know...skip that judge sentence...

15. 安裝torch-hdf5

sudo apt-get install libhdf5-serial-dev hdf5-tools git clone https://github.com/deepmind/torch-hdf5 
cd torch-hdf5 sudo luarocks make hdf5-0-0.rockspec LIBHDF5_LIBDIR=」/usr/lib/x86_64-Linux-gnu/」

17. iTorch安裝

git clone https://github.com/zeromq/zeromq4-1.git 
mkdir build-zeromq cd build-zeromq cmake .. make && make install 安裝完以後，luarocks install itorch 以後能夠經過luarocks list查看是否安裝成功

相關標籤/搜索

每日一句

每一个你不满意的现在，都有一个你没有努力的曾经。