MXNet Study Notes (3): Custom Ops

Date: 2019-12-06

Tags: mxnet, study notes, custom op

Original post: https://blog.csdn.net/u011765306/article/details/54562282

Preface

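Today I needed a tile operation (similar to np.tile, which repeats data along the given axes) and found that MXNet does not have one. Quite a few operations are not implemented yet; progress can be followed in the corresponding issue, and the work is still ongoing. That does not stop us from getting the operation we need, though: we just have to implement our own Op. The official example/numpy-ops directory already contains some examples, and this post records the details.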
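For readers who have not used np.tile, here is a minimal NumPy-only sketch of what the operation does. The second call shows the variant that matches the (2, 3) to (2, 3, 10) shape reported later in this post; the array values and the repeat count 4 are just illustrative choices.

```python
import numpy as np

x = np.array([[1, 2, 3],
              [4, 5, 6]])                  # shape (2, 3)

# np.tile repeats the whole array; reps=(1, 2) doubles it along axis 1.
wide = np.tile(x, (1, 2))                  # shape (2, 6)

# Variant: add a new last axis and repeat every element along it.
stacked = np.tile(x[..., np.newaxis], 4)   # shape (2, 3, 4)

print(wide.shape, stacked.shape)
```

In the second form every slice stacked[:, :, k] is a copy of x, which is the behavior the custom op in this post aims for.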

Custom Op design

Custom ops all inherit from classes in operator.py, which provides the following classes:

operator.py
CustomOp(object)
CustomOpProp(object)
NDArrayOp(PythonOp)
NumpyOp(PythonOp)
PythonOp(object)
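This makes it clear that operator.py offers two routes: one is CustomOp, the other is to inherit from PythonOp. The two routes are covered separately below.

The CustomOp class

This route has three steps. First, subclass CustomOp and override forward() and backward(). Second, subclass CustomOpProp, override its member methods, and return the Op written in step one from create_operator. Third, call operator.register() to register the operation. The official code in example/numpy-ops/custom_softmax.py illustrates this: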

import mxnet as mx
import numpy as np


class Softmax(mx.operator.CustomOp):
    def forward(self, is_train, req, in_data, out_data, aux):
        x = in_data[0].asnumpy()
        # Subtract the row-wise max before exponentiating for numerical stability.
        y = np.exp(x - x.max(axis=1).reshape((x.shape[0], 1)))
        y /= y.sum(axis=1).reshape((x.shape[0], 1))
        self.assign(out_data[0], req[0], mx.nd.array(y))

    def backward(self, req, out_grad, in_data, out_data, in_grad, aux):
        # np.int was removed in NumPy 1.24; the builtin int is equivalent.
        l = in_data[1].asnumpy().ravel().astype(int)
        y = out_data[0].asnumpy()
        # Softmax output minus the one-hot label.
        y[np.arange(l.shape[0]), l] -= 1.0
        self.assign(in_grad[0], req[0], mx.nd.array(y))


@mx.operator.register("softmax")
class SoftmaxProp(mx.operator.CustomOpProp):
    def __init__(self):
        super(SoftmaxProp, self).__init__(need_top_grad=False)

    def list_arguments(self):
        return ['data', 'label']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        data_shape = in_shape[0]
        label_shape = (in_shape[0][0],)
        output_shape = in_shape[0]
        return [data_shape, label_shape], [output_shape], []

    def create_operator(self, ctx, shapes, dtypes):
        return Softmax()
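The arithmetic in Softmax.forward() and Softmax.backward() can be checked outside MXNet. The sketch below is NumPy-only and assumes the op is used as a loss layer whose implicit loss is the summed cross-entropy (which is how I read need_top_grad=False and the unused out_grad); the helper names cross_entropy and analytic_grad are mine, not part of the official example.

```python
import numpy as np


def softmax(x):
    # Same arithmetic as Softmax.forward: subtract the row max for stability.
    y = np.exp(x - x.max(axis=1, keepdims=True))
    return y / y.sum(axis=1, keepdims=True)


def cross_entropy(x, label):
    # Assumed loss: sum over the batch of -log p[label].
    p = softmax(x)
    return -np.log(p[np.arange(len(label)), label]).sum()


def analytic_grad(x, label):
    # Same arithmetic as Softmax.backward: softmax output minus one-hot label.
    g = softmax(x)
    g[np.arange(len(label)), label] -= 1.0
    return g


rng = np.random.RandomState(0)
x = rng.randn(4, 5)
label = np.array([0, 3, 1, 4])

# Central finite differences of the loss with respect to every input entry.
eps = 1e-6
numeric = np.zeros_like(x)
for idx in np.ndindex(*x.shape):
    step = np.zeros_like(x)
    step[idx] = eps
    numeric[idx] = (cross_entropy(x + step, label)
                    - cross_entropy(x - step, label)) / (2 * eps)

# Largest disagreement between the two gradients; it should be tiny.
print(np.abs(numeric - analytic_grad(x, label)).max())
```

A finite-difference check like this is a cheap way to validate any hand-written backward() before wiring it into an operator.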
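The code above is a custom softmax. The Softmax class overrides forward() and backward(), much like defining a layer in Caffe: forward() defines the layer's forward computation, and backward() defines the gradient computation for backpropagation. Once that is done, create_operator() in SoftmaxProp creates and returns a Softmax() instance. So how is the third step, register, carried out? SoftmaxProp carries the decorator mx.operator.register(), which is equivalent to calling register("custom_op")(CustomOpProp). The Op is therefore registered as soon as the class definition is executed, before any model code uses it, similar to the decorator used for optimizers.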

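The PythonOp class

On this route PythonOp is the base class, but when defining an Op we usually do not inherit from it directly; we use its subclasses NDArrayOp and NumpyOp. This route does not need the three steps that CustomOp requires. Again, this post only covers how to subclass and define the operation, not how the two classes are implemented. Here is the official example: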

class NDArraySoftmax(mx.operator.NDArrayOp):
    def __init__(self):
        super(NDArraySoftmax, self).__init__(False)
        self.fwd_kernel = None
        self.bwd_kernel = None

    def list_arguments(self):
        return ['data', 'label']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        data_shape = in_shape[0]
        label_shape = (in_shape[0][0],)
        output_shape = in_shape[0]
        return [data_shape, label_shape], [output_shape]

    def forward(self, in_data, out_data):
        x = in_data[0]
        y = out_data[0]
        if self.fwd_kernel is None:
            # Note: the official example also passes the CUDA kernel source
            # string to mx.rtc; it is omitted in this listing.
            self.fwd_kernel = mx.rtc('softmax', [('x', x)], [('y', y)])
        self.fwd_kernel.push([x], [y], (1, 1, 1), (x.shape[0], 1, 1))

    def backward(self, out_grad, in_data, out_data, in_grad):
        l = in_data[1]
        y = out_data[0]
        dx = in_grad[0]
        if self.bwd_kernel is None:
            # Kernel source omitted here as well.
            self.bwd_kernel = mx.rtc('softmax_grad', [('y', y), ('l', l)], [('dx', dx)])
        self.bwd_kernel.push([y, l], [dx], (y.shape[0], 1, 1), (y.shape[1], 1, 1))
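Subclassing NDArrayOp is similar to subclassing NumpyOp. The difference lies in which functions forward() and backward() may use: NDArrayOp has to use operations from mx.nd, whereas NumpyOp can use numpy operations. In both cases the key parts are forward() and backward(). Note that an op defined this way can only be used after an object of the class has been created, so it becomes available at a different time than a CustomOp, which is registered when its class is defined.

Member methods list_arguments, list_outputs, infer_shape

Although the two inheritance routes differ, the effect is the same: forward() and backward() define what the Op computes, and the remaining three member methods describe the Op's interface.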

list_arguments 　　

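This method names the Op's arguments. In the examples above it is mostly ['data', 'label'], so the Op has to be called with the arguments data and label. This also shows that MXNet looks up variables by name; DataIter and the optimizers work the same way.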

list_outputs

Similarly, this method defines the names of the output variables, generally opname + '_output'.

infer_shape

Given the input shapes, this method returns the Op's output shapes. When writing a custom op we have to design the input and output shapes ourselves.
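To make the infer_shape contract concrete, here is a plain-Python sketch that reuses the shape rule from the softmax examples above. The free function softmax_infer_shape is a hypothetical name of mine; in MXNet the logic lives in the infer_shape method of the op class.

```python
def softmax_infer_shape(in_shape):
    # in_shape holds one shape per name returned by list_arguments(),
    # here ['data', 'label'].
    data_shape = in_shape[0]
    # The label shape is derived from the batch dimension of the data,
    # so the caller does not have to supply it.
    label_shape = (in_shape[0][0],)
    output_shape = in_shape[0]
    # First list: argument shapes; second list: output shapes.
    return [data_shape, label_shape], [output_shape]


arg_shapes, out_shapes = softmax_infer_shape([(4, 5), ()])
print(arg_shapes, out_shapes)
```

The point is that infer_shape returns shapes for every argument as well as every output, which lets it fill in argument shapes (such as the label) that were not given.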

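That is everything a custom Op needs. The key parts are still forward() and backward(); when you are stuck, the corresponding Caffe implementation can be a good source of ideas. Next I illustrate the methods above with an example.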

import mxnet as mx
import numpy as np


class TileLayer(mx.operator.NumpyOp):
    def __init__(self, tiles, axis):
        super(TileLayer, self).__init__(False)
        # tiles may be a list or a single number; the code below assumes a single number
        self.tiles = tiles
        self.axis = axis

    def list_arguments(self):
        return ['input']

    def list_outputs(self):
        return ['output']

    def infer_shape(self, in_shape):
        data_shape = in_shape[0]
        # The output gains a new last axis of length tiles.
        output_shape = list(in_shape[0]) + [self.tiles]
        return [data_shape], [output_shape]

    def forward(self, in_data, out_data):
        x = in_data[0]
        y = out_data[0]
        # Write into the output buffer in place; "y = ..." would only rebind
        # the local name. The new last axis makes the result match infer_shape.
        y[:] = np.tile(x[..., np.newaxis], self.tiles)

    def backward(self, out_grad, in_data, out_data, in_grad):
        bottom_diff = in_grad[0]
        # Each input element was copied tiles times, so sum the copies' gradients.
        top_diff = np.sum(out_grad[0], axis=self.axis)
        bottom_diff[:] = top_diff


if __name__ == '__main__':
    import logging
    from collections import namedtuple
    # Module.forward() reads the batch through its .data field.
    Batch = namedtuple('Batch', ['data'])
    logging.basicConfig(level=logging.INFO)
    a = mx.sym.Variable('data')
    custie = TileLayer(tiles=10, axis=2)
    tiles_a = custie(input=a, name='tileop')
    arg_shapes, out_shape, aux_shape = tiles_a.infer_shape(data=(2, 3))
    logging.info('arg_shape:{}\n, out_shape:{}\n, aux_shape:{}\n, output_blob:{}'.format(arg_shapes, out_shape, aux_shape, tiles_a.list_outputs()))
    exe = mx.module.Module(symbol=tiles_a, logger=logging)
    exe.bind(data_shapes=[('data', (1, 10, 10))], inputs_need_grad=True)
    # exe.init_params()
    # exe.init_optimizer()
    # data1 = [mx.nd.ones((1, 10, 10))]
    # exe.forward(Batch(data1))

    # print(exe.get_outputs()[0].asnumpy().shape)
    # top_grads = np.random.random(size=(1, 10, 10, 10))
    # exe.backward(out_grads=top_grads)
    # print(exe.get_input_grads()[0].asnumpy())
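One detail that is easy to get wrong in a NumpyOp: as far as I can tell from the official NumpyOp softmax example, forward() and backward() receive preallocated arrays and are expected to fill them in place. The NumPy-only sketch below shows the difference between rebinding a local name and writing into the buffer; forward_rebind and forward_inplace are hypothetical helpers, not MXNet methods.

```python
import numpy as np

x = np.arange(6.0).reshape(2, 3)


def forward_rebind(in_data, out_data):
    y = out_data[0]
    y = in_data[0] * 2.0        # rebinds the local name; the buffer is untouched


def forward_inplace(in_data, out_data):
    y = out_data[0]
    y[:] = in_data[0] * 2.0     # copies the result into the existing buffer


buf_a = np.zeros_like(x)
buf_b = np.zeros_like(x)
forward_rebind([x], [buf_a])
forward_inplace([x], [buf_b])

# buf_a is still all zeros, buf_b now holds 2 * x.
print(buf_a.sum(), buf_b.sum())
```

The same applies to in_grad in backward(): the gradient has to be written with slice assignment, otherwise the caller never sees it.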
The above defines the tile operation. It is not a full tile: it only tiles the data along the last axis. forward uses numpy.tile, and backward follows the TileLayer implementation in Caffe. Running the code gives:

INFO:root:arg_shape:[(2L, 3L)]
out_shape:[(2L, 3L, 10L)]
aux_shape:[]
output_blob:['tileop_output']
Because list_arguments names the argument input, the op has to be called with input as the keyword. The output also shows the results of infer_shape and list_outputs. Those are the basic details.
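The forward/backward pair of the tile op can also be verified with plain NumPy. The sketch below assumes the op adds a new last axis of length tiles, as the reported output shape (2, 3, 10) suggests; tile_forward, tile_backward and the random weights w are illustrative names and data of mine.

```python
import numpy as np


def tile_forward(x, tiles):
    # Add a new last axis and repeat each element `tiles` times along it.
    return np.tile(x[..., np.newaxis], tiles)


def tile_backward(out_grad):
    # Each input element fed `tiles` outputs, so its gradient is their sum.
    return out_grad.sum(axis=-1)


rng = np.random.RandomState(0)
x = rng.randn(2, 3)
w = rng.randn(2, 3, 10)          # arbitrary weights defining a scalar loss

y = tile_forward(x, 10)
# For loss = sum(w * y), the gradient with respect to y is w itself.
analytic = tile_backward(w)

# Finite-difference check of d(loss)/dx.
eps = 1e-6
numeric = np.zeros_like(x)
for idx in np.ndindex(*x.shape):
    step = np.zeros_like(x)
    step[idx] = eps
    numeric[idx] = (np.sum(w * tile_forward(x + step, 10))
                    - np.sum(w * tile_forward(x - step, 10))) / (2 * eps)

print(y.shape, np.abs(numeric - analytic).max())
```

Summing over the tiled axis is the same rule Caffe's TileLayer uses in its backward pass, which is why the op's backward() reduces out_grad with np.sum.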

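After defining the Op, we wrap it with mx.mod.Module() and call bind() to allocate memory. After that there are two ways to train it:

Call init_params() to initialize the parameters (there are none to initialize here) and init_optimizer() to initialize the optimizer; forward() and backward() then run the forward and backward passes that train the module.

Or call fit() directly, since fit() includes the initialization steps.

For more about Module, see mx.mod.Module.

Author: 我只是空氣. Source: CSDN. Original: https://blog.csdn.net/u011765306/article/details/54562282. Copyright notice: this is the blogger's original article; please include a link to the post when reposting.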
