implementation:
Recent hands-on note: the COCOB optimizer turned out to be very unstable when training some networks; Adam is still the more reliable choice.
A cosine learning-rate schedule is usually paired with a momentum optimizer.
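For reference, a minimal sketch of what a cosine schedule typically looks like, assuming the common half-cosine decay from a base rate down to zero with no warm-up and no restarts. The name `cosine_lr` and its parameters are hypothetical and not taken from any library.

```python
import math


def cosine_lr(step, total_steps, base_lr):
    """Half-cosine decay from base_lr (step 0) to 0 (step == total_steps).

    Assumption: no warm-up and no restarts; real schedules often add both.
    """
    progress = min(step, total_steps) / total_steps  # clamp to [0, 1]
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))


# The rate starts at base_lr, is halved at the midpoint, and ends near zero.
schedule = [cosine_lr(s, 100, 0.1) for s in (0, 50, 100)]
```

The value returned for each step would then be fed to the momentum optimizer as its learning rate.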
Noting this down because it took me embarrassingly long to understand:
tf.nn.separable_conv2d(inputs, depthwise_filter, pointwise_filter)
depthwise_filter: [filter_height, filter_width, in_channels, channel_multiplier]
pointwise_filter: [1, 1, channel_multiplier * in_channels, out_channels]
output[b, i, j, k] = sum_{di, dj, q, r}
    input[b, strides[1] * i + di, strides[2] * j + dj, q] *
    depthwise_filter[di, dj, q, r] *
    pointwise_filter[0, 0, q * channel_multiplier + r, k]
So what actually happens here is that channel q of depthwise_filter is applied only to channel q of the input. The pointwise_filter then mixes the resulting in_channels * channel_multiplier channels into out_channels.
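To convince myself of the formula, here is a naive pure-Python sketch that spells it out with nested loops. It is not TensorFlow's implementation: the name `separable_conv2d_ref` is made up, it assumes VALID padding and a single stride shared by both spatial dimensions, and it works on nested lists instead of tensors.

```python
def separable_conv2d_ref(inp, depthwise_filter, pointwise_filter, stride=1):
    """Naive reference for the separable-conv formula (VALID padding assumed).

    inp:              [batch][height][width][in_channels]
    depthwise_filter: [fh][fw][in_channels][channel_multiplier]
    pointwise_filter: [1][1][in_channels * channel_multiplier][out_channels]
    """
    fh, fw = len(depthwise_filter), len(depthwise_filter[0])
    in_ch = len(depthwise_filter[0][0])
    mult = len(depthwise_filter[0][0][0])
    out_ch = len(pointwise_filter[0][0][0])
    out_h = (len(inp[0]) - fh) // stride + 1
    out_w = (len(inp[0][0]) - fw) // stride + 1

    out = []
    for b in range(len(inp)):
        rows = []
        for i in range(out_h):
            cols = []
            for j in range(out_w):
                pixel = []
                for k in range(out_ch):
                    total = 0.0
                    for di in range(fh):
                        for dj in range(fw):
                            for q in range(in_ch):      # input channel
                                for r in range(mult):   # multiplier index
                                    total += (
                                        inp[b][stride * i + di][stride * j + dj][q]
                                        * depthwise_filter[di][dj][q][r]
                                        * pointwise_filter[0][0][q * mult + r][k]
                                    )
                    pixel.append(total)
                cols.append(pixel)
            rows.append(cols)
        out.append(rows)
    return out


# 1x2x2x2 input: channel 0 holds 1..4, channel 1 holds 10..40.
inp = [[[[1, 10], [2, 20]], [[3, 30], [4, 40]]]]
# 2x2 depthwise filter of ones, channel_multiplier = 1.
dw = [[[[1], [1]], [[1], [1]]], [[[1], [1]], [[1], [1]]]]
# Identity pointwise filter, so each output channel shows one depthwise result.
pw = [[[[1, 0], [0, 1]]]]

out = separable_conv2d_ref(inp, dw, pw)  # [[[[10.0, 100.0]]]]
```

With the identity pointwise filter, output channel 0 is the sum over input channel 0 only and output channel 1 is the sum over input channel 1 only, which is exactly the "channel q only sees channel q" behavior.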