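- For now, this post only looks at how the various frameworks expose grouped convolution in their APIs. Compared with the efficient implementation of ordinary convolution (see "Convolution Operation Explained"), how to implement grouped convolution efficiently remains to be studied!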
- A normal convolutional layer. Yellow blocks represent learned parameters, gray blocks represent feature maps/input images (working memory).
- A convolutional layer with 2 filter groups. Note that each of the filters in the grouped convolutional layer is now exactly half the depth, i.e. half the parameters and half the compute of the original filter.
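The "half the parameters" claim in the caption above can be checked with a little arithmetic. The following is a minimal sketch; the helper name `conv_param_count` and the layer sizes (64 input channels, 128 output channels, 3x3 kernels) are assumptions chosen for illustration and do not come from the original figures.

```python
def conv_param_count(in_channels, out_channels, kernel_h, kernel_w, groups=1):
    """Number of weights (bias excluded) in a 2-D convolution with `groups` filter groups."""
    if in_channels % groups or out_channels % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    # Each filter only spans in_channels // groups input channels.
    return out_channels * (in_channels // groups) * kernel_h * kernel_w


# Hypothetical layer sizes, used only to illustrate the ratio.
normal = conv_param_count(64, 128, 3, 3, groups=1)
grouped = conv_param_count(64, 128, 3, 3, groups=2)
print(normal, grouped)  # 73728 36864
```

Because every output position needs one multiply-accumulate per weight, the compute drops by the same factor as the parameter count.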
TensorFlow implementation:
depthwise_conv2d_native:

    for k in 0..in_channels-1
      for q in 0..channel_multiplier-1
        output[b, i, j, k * channel_multiplier + q] =
          sum_{di, dj} input[b, strides[1] * i + di, strides[2] * j + dj, k] *
                       filter[di, dj, k, q]

depthwise_conv2d:

    output[b, i, j, k * channel_multiplier + q] =
      sum_{di, dj} filter[di, dj, k, q] *
                   input[b, strides[1] * i + rate[0] * di, strides[2] * j + rate[1] * dj, k]

conv2d:

    output[b, i, j, k] =
      sum_{di, dj, q} input[b, strides[1] * i + di, strides[2] * j + dj, q] *
                      filter[di, dj, q, k]
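- These are the implementations of ordinary convolution and depthwise (separable) convolution. The raw depthwise_conv2d_native op can apparently be used to help implement grouped convolution, though it is slow; depthwise_conv2d uses a matrix-based implementation.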
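To make the index arithmetic of the depthwise_conv2d_native formula concrete, here is a direct loop-for-loop transcription in plain Python. It is only a reference sketch, not how TensorFlow implements the op: the function name `depthwise_conv2d_reference` is hypothetical, the tensors are nested lists in NHWC layout, and "VALID" (no) padding is assumed.

```python
def depthwise_conv2d_reference(inp, filt, strides=(1, 1, 1, 1)):
    """Naive depthwise convolution following the formula term by term.

    inp:  nested lists shaped [batch][height][width][in_channels]
    filt: nested lists shaped [filter_h][filter_w][in_channels][channel_multiplier]
    Assumes no padding ("VALID"), which the formula itself does not specify.
    """
    batch, in_h, in_w, in_c = len(inp), len(inp[0]), len(inp[0][0]), len(inp[0][0][0])
    f_h, f_w, mult = len(filt), len(filt[0]), len(filt[0][0][0])
    out_h = (in_h - f_h) // strides[1] + 1
    out_w = (in_w - f_w) // strides[2] + 1
    out = [[[[0.0] * (in_c * mult) for _ in range(out_w)]
            for _ in range(out_h)] for _ in range(batch)]
    for b in range(batch):
        for i in range(out_h):
            for j in range(out_w):
                for k in range(in_c):          # each input channel separately
                    for q in range(mult):      # channel_multiplier filters per channel
                        acc = 0.0
                        for di in range(f_h):
                            for dj in range(f_w):
                                acc += (inp[b][strides[1] * i + di][strides[2] * j + dj][k]
                                        * filt[di][dj][k][q])
                        out[b][i][j][k * mult + q] = acc
    return out


# One 2x2 image with 2 channels; a 2x2 all-ones filter with channel_multiplier = 1.
image = [[[[1, 10], [2, 20]],
          [[3, 30], [4, 40]]]]
ones = [[[[1], [1]], [[1], [1]]],
        [[[1], [1]], [[1], [1]]]]
result = depthwise_conv2d_reference(image, ones)
print(result)  # [[[[10.0, 100.0]]]]
```

Each output channel only ever reads input channel k, which is the property that makes depthwise convolution the extreme case of grouped convolution.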
PyTorch implementation:
- torch.nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0, dilation=1, groups=1, bias=True, padding_mode='zeros')
- groups controls the connections between inputs and outputs. in_channels and out_channels must both be divisible by groups. For example,
At groups=1, all inputs are convolved to all outputs.
At groups=2, the operation becomes equivalent to having two conv layers side by side, each seeing half the input channels, and producing half the output channels, and both subsequently concatenated.
At groups=in_channels, each input channel is convolved with its own set of filters, of size $\left\lfloor \frac{out\_channels}{in\_channels} \right\rfloor$.
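The three cases of groups described above can be reproduced with a small reference implementation. This is a sketch in plain Python rather than PyTorch code: `conv2d` and `grouped_conv2d` are hypothetical helpers, tensors are nested lists for a single sample in channel-first layout, and stride 1 with no padding is assumed. The weight layout [out_channels][in_channels // groups][kh][kw] mirrors the one torch.nn.Conv2d uses.

```python
def conv2d(x, w):
    """Plain convolution (cross-correlation), stride 1, no padding.

    x: [in_channels][H][W]; w: [out_channels][in_channels][kh][kw].
    """
    kh, kw = len(w[0][0]), len(w[0][0][0])
    out_h, out_w = len(x[0]) - kh + 1, len(x[0][0]) - kw + 1
    return [[[sum(x[c][i + di][j + dj] * w_o[c][di][dj]
                  for c in range(len(x)) for di in range(kh) for dj in range(kw))
              for j in range(out_w)] for i in range(out_h)] for w_o in w]


def grouped_conv2d(x, w, groups):
    """Grouped convolution; w: [out_channels][in_channels // groups][kh][kw]."""
    if len(x) % groups or len(w) % groups:
        raise ValueError("in_channels and out_channels must be divisible by groups")
    in_per_group = len(x) // groups
    out_per_group = len(w) // groups
    out = []
    for g in range(groups):
        x_g = x[g * in_per_group:(g + 1) * in_per_group]    # this group's input channels
        w_g = w[g * out_per_group:(g + 1) * out_per_group]  # this group's filters
        out.extend(conv2d(x_g, w_g))                        # concatenate along channels
    return out


# 4 input channels of size 2x2; channel c is filled with the value c + 1.
x = [[[c + 1] * 2 for _ in range(2)] for c in range(4)]

# groups=2: 4 filters, each a 1x1 kernel of ones over 2 input channels.
w = [[[[1]], [[1]]] for _ in range(4)]
y = grouped_conv2d(x, w, groups=2)
print([ch[0][0] for ch in y])  # [3, 3, 7, 7]

# groups=in_channels: one 1x1 filter per input channel (depthwise).
dw = grouped_conv2d(x, [[[[1]]] for _ in range(4)], groups=4)
print([ch[0][0] for ch in dw])  # [1, 2, 3, 4]
```

With groups=2 the first two outputs only see channels 0 and 1 (1 + 2 = 3) and the last two only see channels 2 and 3 (3 + 4 = 7), which matches running two independent convolutions side by side and concatenating the results.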
Caffe implementation:
    layer {
      name: "conv1"
      type: "Convolution"
      bottom: "data"
      top: "conv1"
      # learning rate and decay multipliers for the filters
      param { lr_mult: 1 decay_mult: 1 }
      # learning rate and decay multipliers for the biases
      param { lr_mult: 2 decay_mult: 0 }
      convolution_param {
        num_output: 96     # learn 96 filters
        kernel_size: 11    # each filter is 11x11
        stride: 4          # step 4 pixels between each filter application
        weight_filler {
          type: "gaussian" # initialize the filters from a Gaussian
          std: 0.01        # distribution with stdev 0.01 (default mean: 0)
        }
        bias_filler {
          type: "constant" # initialize the biases to zero (0)
          value: 0
        }
      }
    }
Optional
- bias_term [default true]: specifies whether to learn and apply a set of additive biases to the filter outputs
- pad (or pad_h and pad_w) [default 0]: specifies the number of pixels to (implicitly) add to each side of the input
- stride (or stride_h and stride_w) [default 1]: specifies the intervals at which to apply the filters to the input
- group (g) [default 1]: If g > 1, we restrict the connectivity of each filter to a subset of the input. Specifically, the input and output channels are separated into g groups, and the ith output group channels will be only connected to the ith input group channels.
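How pad, stride and group interact can be sketched with two small helpers. Both function names are hypothetical, and the 227-pixel input size and the 96/256 channel counts are assumptions for illustration; the layer definition above does not state its input shape.

```python
def conv_output_size(in_size, kernel_size, stride=1, pad=0):
    """Spatial output size of a convolution along one dimension (floor division)."""
    return (in_size + 2 * pad - kernel_size) // stride + 1


def group_channel_ranges(channels, group):
    """Half-open channel ranges [start, end) owned by each of the `group` groups."""
    if channels % group:
        raise ValueError("channel count must be divisible by group")
    per_group = channels // group
    return [(g * per_group, (g + 1) * per_group) for g in range(group)]


# kernel_size: 11 and stride: 4 as in conv1, on an assumed 227-pixel-wide input.
print(conv_output_size(227, kernel_size=11, stride=4, pad=0))  # 55

# With group: 2, the ith output range connects only to the ith input range.
# 96 input and 256 output channels are assumed values.
print(group_channel_ranges(96, 2))   # [(0, 48), (48, 96)]
print(group_channel_ranges(256, 2))  # [(0, 128), (128, 256)]
```

Under these assumptions, output channels 0 to 127 would read only input channels 0 to 47, and output channels 128 to 255 only input channels 48 to 95.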