Compressing deep convolutional networks using vector quantization
Quantized convolutional neural networks for mobile devices
BinaryConnect: Training deep neural networks with binary weights during propagations
BinaryNet: Training deep neural networks with weights and activations constrained to +1 or -1
XNOR-Net: ImageNet classification using binary convolutional neural networks
Deep neural networks are robust to weight binarization and other non-linear distortions
Comparing biases for minimal network construction with back-propagation
Second order derivatives for network pruning: Optimal brain surgeon
Learning both weights and connections for efficient neural networks
Exploiting linear structure within convolutional networks for efficient evaluation
Speeding up convolutional neural networks with low rank expansions
Speeding-up convolutional neural networks using fine-tuned CP-decomposition
Low-rank matrix factorization for deep neural network training with high-dimensional output targets
Understanding and improving convolutional neural networks via concatenated rectified linear units
Inception-v4, Inception-ResNet and the impact of residual connections on learning
SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size
MobileNets: Efficient convolutional neural networks for mobile vision applications
Outrageously large neural networks: The sparsely-gated mixture-of-experts layer
Deep dynamic neural networks for multimodal gesture segmentation and recognition
Deep pyramidal residual networks with separated stochastic depth