Original post: http://blog.csdn.net/mounty_fsc/article/details/51085654
In Caffe, Blob, Layer, Net, and Solver are the core classes. This post introduces the first three; Solver will be covered in the next section.
Blob is defined as follows:
/**
 * @brief A wrapper around SyncedMemory holders serving as the basic
 *        computational unit through which Layer%s, Net%s, and Solver%s
 *        interact.
 *
 * TODO(dox): more thorough description.
 */
template <typename Dtype>
class Blob {
public:
Blob()
: data_(), diff_(), count_(0), capacity_(0) {}
/// @brief Deprecated; use <code>Blob(const vector<int>& shape)</code>.
explicit Blob(const int num, const int channels, const int height,
const int width);
explicit Blob(const vector<int>& shape);
.....
protected:
shared_ptr<SyncedMemory> data_;
shared_ptr<SyncedMemory> diff_;
shared_ptr<SyncedMemory> shape_data_;
vector<int> shape_;
int count_;
int capacity_;
DISABLE_COPY_AND_ASSIGN(Blob);
}; // class Blob
Note: only the constructors and member variables are kept here.
Notes:
From the source code, note that Blob has a member variable vector<int> shape_.
Its role: taking a 2-D matrix stored in a Blob as an example (e.g., a fully connected layer with shape (N, D)), as shown in the figure, the elements are laid out contiguously in row-major order; the same storage scheme extends to blobs with more dimensions.
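As a hedged illustration of this layout, the hypothetical snippet below (a sketch assuming a built Caffe; the shape values are arbitrary) shows that element (n, d) of an (N, D) blob lands at flat offset n * D + d, which is what Blob::offset() computes:
// Hypothetical sketch (assumes a built Caffe): element (n, d) of an (N, D)
// blob sits at flat offset n * D + d in the underlying data_ array.
#include <vector>
#include <caffe/blob.hpp>

int main() {
  const int N = 4, D = 3;
  std::vector<int> shape(2);
  shape[0] = N;
  shape[1] = D;
  caffe::Blob<float> blob(shape);        // shape_ = {4, 3}, count_ = 12

  std::vector<int> index(2);
  index[0] = 1;                          // n = 1
  index[1] = 2;                          // d = 2
  int off = blob.offset(index);          // row-major: 1 * 3 + 2 == 5
  blob.mutable_cpu_data()[off] = 1.0f;   // writes element (1, 2)
  return 0;
}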
As noted in 1.2, Blob is essentially a wrapper around SyncedMemory.
Its core code is as follows:
/**
* @brief Manages memory allocation and synchronization between the host (CPU)
* and device (GPU).
*
* TODO(dox): more thorough description.
*/
class SyncedMemory {
public:
...
const void* cpu_data();
const void* gpu_data();
void* mutable_cpu_data();
void* mutable_gpu_data();
...
private:
...
void* cpu_ptr_;
void* gpu_ptr_;
...
}; // class SyncedMemory
Blob holds both data_ and diff_, each a shared_ptr<SyncedMemory>.
For data_ (and likewise diff_), the actual values live either in CPU memory (cpu_ptr_) or in GPU memory (gpu_ptr_), and there are two ways to access the CPU data (the GPU side is analogous):
Read-only access, const void* cpu_data(), which does not modify the contents that cpu_ptr_ points to.
Mutable access, void* mutable_cpu_data(), which may modify the contents that cpu_ptr_ points to.
Taking mutable_cpu_data() as an example:
void* SyncedMemory::mutable_cpu_data() {
to_cpu();
head_ = HEAD_AT_CPU;
return cpu_ptr_;
}
inline void SyncedMemory::to_cpu() {
switch (head_) {
case UNINITIALIZED:
CaffeMallocHost(&cpu_ptr_, size_, &cpu_malloc_use_cuda_);
caffe_memset(size_, 0, cpu_ptr_);
head_ = HEAD_AT_CPU;
own_cpu_data_ = true;
break;
case HEAD_AT_GPU:
#ifndef CPU_ONLY
if (cpu_ptr_ == NULL) {
CaffeMallocHost(&cpu_ptr_, size_, &cpu_malloc_use_cuda_);
own_cpu_data_ = true;
}
caffe_gpu_memcpy(size_, gpu_ptr_, cpu_ptr_);
head_ = SYNCED;
#else
NO_GPU;
#endif
break;
case HEAD_AT_CPU:
case SYNCED:
break;
}
}
Note: head_ records where the most up-to-date copy of the data lives (UNINITIALIZED, HEAD_AT_CPU, HEAD_AT_GPU, or SYNCED). to_cpu() allocates or copies only when necessary, so data is synchronized between host and device lazily, on first access.
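The hypothetical snippet below (a sketch, assuming Caffe built with GPU support) marks where this lazy synchronization actually allocates or copies memory:
// Hypothetical usage sketch (assumes a GPU build of Caffe).
#include <caffe/blob.hpp>

int main() {
  caffe::Blob<float> blob(1, 3, 4, 5);   // SyncedMemory objects created, still UNINITIALIZED
  float* cpu = blob.mutable_cpu_data();  // to_cpu(): host malloc + memset, head_ = HEAD_AT_CPU
  cpu[0] = 1.0f;                         // write on the CPU side
  blob.gpu_data();                       // to_gpu(): copies CPU -> GPU, head_ = SYNCED
  blob.cpu_data();                       // already SYNCED: no copy
  blob.mutable_gpu_data();               // head_ = HEAD_AT_GPU
  blob.cpu_data();                       // now a GPU -> CPU copy is needed
  return 0;
}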
Layer is the foundation of Caffe and its basic computational unit. Caffe puts strong emphasis on the layered structure of a network: most of a network's functionality is expressed in the form of Layers, such as convolution, pooling, loss, and so on.
Building a Caffe model is likewise layer-based: the network is described in a prototxt file following the network and parameter format defined in src/caffe/proto/caffe.proto (some familiarity with Google Protocol Buffers is required).
As shown in the figure, the Layer named conv1 takes the bottom blob named data as input and produces the top blob named conv1 as output.
Its protobuf definition is shown below; a layer has one or more tops and bottoms, each corresponding to a blob:
layer { name: "conv1" type: "Convolution" bottom: "data" top: "conv1" .... }
/**
 * Layer%s must implement a Forward function, in which they take their input
 * (bottom) Blob%s (if any) and compute their output Blob%s (if any).
 * They may also implement a Backward function, in which they compute the error
 * gradients with respect to their input Blob%s, given the error gradients with
 * their output Blob%s.
 */
template <typename Dtype>
class Layer {
public:
/**
 * You should not implement your own constructor. Any set up code should go
 * to SetUp(), where the dimensions of the bottom blobs are provided to the
 * layer.
 */
explicit Layer(const LayerParameter& param)
: layer_param_(param), is_shared_(false) {
...
}
virtual ~Layer() {}
/**
 * @brief Implements common layer setup functionality.
 * @param bottom the preshaped input blobs
 * @param top
 *     the allocated but unshaped output blobs, to be shaped by Reshape
 */
void SetUp(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
...
}
...
/**
 * @brief Given the bottom blobs, compute the top blobs and the loss.
 * \return The total loss from the layer.
 *
 * The Forward wrapper calls the relevant device wrapper function
 * (Forward_cpu or Forward_gpu) to compute the top blob values given the
 * bottom blobs. If the layer has any non-zero loss_weights, the wrapper
 * then computes and returns the loss.
 *
 * Your layer should implement Forward_cpu and (optionally) Forward_gpu.
 */
inline Dtype Forward(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top);
/**
 * @brief Given the top blob error gradients, compute the bottom blob error
 *        gradients.
 *
 * @param top
 *     the output blobs, whose diff fields store the gradient of the error
 *     with respect to themselves
 * @param propagate_down
 *     a vector with equal length to bottom, with each index indicating
 *     whether to propagate the error gradients down to the bottom blob at
 *     the corresponding index
 * @param bottom
 *     the input blobs, whose diff fields will store the gradient of the error
 *     with respect to themselves after Backward is run
 *
 * The Backward wrapper calls the relevant device wrapper function
 * (Backward_cpu or Backward_gpu) to compute the bottom blob diffs given the
 * top blob diffs.
 *
 * Your layer should implement Backward_cpu and (optionally) Backward_gpu.
 */
inline void Backward(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom);
...
protected:
/** The protobuf that stores the layer parameters */
LayerParameter layer_param_;
/** The phase: TRAIN or TEST */
Phase phase_;
/** The vector that stores the learnable parameters as a set of blobs. */
vector<shared_ptr<Blob<Dtype> > > blobs_;
/** Vector indicating whether to compute the diff of each param blob. */
vector<bool> param_propagate_down_;
/** The vector that indicates whether each top blob has a non-zero weight in
 *  the objective function. */
vector<Dtype> loss_;
virtual void Forward_cpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) = 0;
virtual void Forward_gpu(const vector<Blob<Dtype>*>& bottom,
const vector<Blob<Dtype>*>& top) {
// LOG(WARNING) << "Using CPU code as backup.";
return Forward_cpu(bottom, top);
}
virtual void Backward_cpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) = 0;
virtual void Backward_gpu(const vector<Blob<Dtype>*>& top,
const vector<bool>& propagate_down,
const vector<Blob<Dtype>*>& bottom) {
// LOG(WARNING) << "Using CPU code as backup.";
Backward_cpu(top, propagate_down, bottom);
}
...
}; // class Layer
Note: each layer defines three kinds of operations: SetUp (initialize the layer and reshape the top blobs), Forward (compute the top blobs from the bottom blobs), and Backward (compute the gradients with respect to the bottom blobs and the layer parameters).
The derived classes of Layer fall into several categories (Vision Layers, Loss Layers, Activation/Neuron Layers, Data Layers, and Common Layers); the Common Layers, for example, include:
Inner Product (InnerProduct), Splitting (Split), Flattening (Flatten), Reshape (Reshape), Concatenation (Concat), Slicing (Slice), Elementwise (Eltwise), Argmax (ArgMax), Softmax (Softmax), Mean-Variance Normalization (MVN), etc.
Note: the names in parentheses are the layer type strings; see reference [2] for the full list.
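To make these three operations concrete, here is a hypothetical minimal layer sketch (DoubleLayer and its doubling behaviour are invented for illustration, not part of Caffe) that overrides the virtual functions of the Layer interface shown above:
// Hypothetical layer computing top = 2 * bottom, for illustration only.
#include <vector>
#include <caffe/layer.hpp>

template <typename Dtype>
class DoubleLayer : public caffe::Layer<Dtype> {
 public:
  explicit DoubleLayer(const caffe::LayerParameter& param)
      : caffe::Layer<Dtype>(param) {}
  // Reshape is called by SetUp(); the top blob gets the bottom blob's shape.
  virtual void Reshape(const std::vector<caffe::Blob<Dtype>*>& bottom,
                       const std::vector<caffe::Blob<Dtype>*>& top) {
    top[0]->ReshapeLike(*bottom[0]);
  }
  virtual inline const char* type() const { return "Double"; }

 protected:
  // Forward: y = 2 * x
  virtual void Forward_cpu(const std::vector<caffe::Blob<Dtype>*>& bottom,
                           const std::vector<caffe::Blob<Dtype>*>& top) {
    const Dtype* x = bottom[0]->cpu_data();
    Dtype* y = top[0]->mutable_cpu_data();
    for (int i = 0; i < bottom[0]->count(); ++i) {
      y[i] = Dtype(2) * x[i];
    }
  }
  // Backward: dL/dx = 2 * dL/dy (chain rule through y = 2x)
  virtual void Backward_cpu(const std::vector<caffe::Blob<Dtype>*>& top,
                            const std::vector<bool>& propagate_down,
                            const std::vector<caffe::Blob<Dtype>*>& bottom) {
    if (!propagate_down[0]) { return; }
    const Dtype* top_diff = top[0]->cpu_diff();
    Dtype* bottom_diff = bottom[0]->mutable_cpu_diff();
    for (int i = 0; i < bottom[0]->count(); ++i) {
      bottom_diff[i] = Dtype(2) * top_diff[i];
    }
  }
};
A real layer would also be registered with REGISTER_LAYER_CLASS so that it can be created from a prototxt; that step is omitted here.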
A Net is composed of multiple Layers.
A typical network starts from a data layer (which loads data from disk) and ends at a loss layer. The figure shows a simple logistic regression classifier.
It can be defined as follows:
name: "LogReg"
layer { name: "mnist" type: "Data" top: "data" top: "label" data_param { source: "input_leveldb" batch_size: 64 }
}
layer { name: "ip" type: "InnerProduct" bottom: "data" top: "ip" inner_product_param { num_output: 2 }
}
layer { name: "loss" type: "SoftmaxWithLoss" bottom: "ip" bottom: "label" top: "loss" }
/**
 * @brief Connects Layer%s together into a directed acyclic graph (DAG)
 *        specified by a NetParameter.
 *
 * TODO(dox): more thorough description.
 */
template <typename Dtype>
class Net {
public:
...
/// @brief Initialize a network with a NetParameter.
void Init(const NetParameter& param);
...
const vector<Blob<Dtype>*>& Forward(const vector<Blob<Dtype>* > & bottom,
Dtype* loss = NULL);
...
/**
 * The network backward should take no input and output, since it solely
 * computes the gradient w.r.t the parameters, and the data has already been
 * provided during the forward pass.
 */
void Backward();
...
Dtype ForwardBackward(const vector<Blob<Dtype>* > & bottom) {
Dtype loss;
Forward(bottom, &loss);
Backward();
return loss;
}
...
protected:
...
/// @brief The network name
string name_;
/// @brief The phase: TRAIN or TEST
Phase phase_;
/// @brief Individual layers in the net
vector<shared_ptr<Layer<Dtype> > > layers_;
/// @brief the blobs storing intermediate results between the layer.
vector<shared_ptr<Blob<Dtype> > > blobs_;
vector<vector<Blob<Dtype>*> > bottom_vecs_;
vector<vector<Blob<Dtype>*> > top_vecs_;
...
/// The root net that actually holds the shared layers in data parallelism
const Net* const root_net_;
};
} // namespace caffe
Notes: Net::Init() builds the DAG from the NetParameter. layers_ holds the layers of the net, blobs_ holds the intermediate blobs passed between them, and bottom_vecs_/top_vecs_ record, for each layer, pointers to its input and output blobs. ForwardBackward() runs a full forward pass followed by a backward pass and returns the loss.
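As a hedged end-to-end sketch (assuming a built Caffe and that the LogReg definition above has been saved as logreg.prototxt, a file name chosen only for illustration), one forward/backward pass looks like this:
// Hypothetical usage sketch (assumes a built Caffe and a logreg.prototxt file).
#include <cstdio>
#include <vector>
#include <caffe/common.hpp>
#include <caffe/net.hpp>

int main() {
  caffe::Caffe::set_mode(caffe::Caffe::CPU);               // run on the CPU
  caffe::Net<float> net("logreg.prototxt", caffe::TRAIN);  // Init() builds the DAG
  std::vector<caffe::Blob<float>*> bottom;                 // empty: the Data layer supplies the input
  float loss = net.ForwardBackward(bottom);                // forward pass, then Backward()
  std::printf("loss = %f\n", loss);
  return 0;
}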
References:
[1].http://caffe.berkeleyvision.org/tutorial/net_layer_blob.html
[2].http://caffe.berkeleyvision.org/tutorial/layers.html
[3].https://yufeigan.github.io
[4].https://www.zhihu.com/question/27982282