Substituting Convolutions for Neural Network Compression

Elliot J Crowley, Gavin Gray, Jack Turner, Amos J Storkey

Research output: Contribution to journal › Article › peer-review

Abstract

Many practitioners would like to deploy deep, convolutional neural networks in memory-limited scenarios, e.g., on an embedded device. However, with an abundance of compression techniques available it is not obvious how to proceed; many bring with them additional hyperparameter tuning, and are specific to particular network types. In this paper, we propose a simple compression technique that is general, easy to apply, and requires minimal tuning. Given a large, trained network, we propose (i) substituting its expensive convolutions with cheap alternatives, leaving the overall architecture unchanged; (ii) treating this new network as a student and training it with the original as a teacher through distillation. We demonstrate this approach separately for (i) networks predominantly consisting of full 3×3 convolutions and (ii) networks predominantly consisting of 1×1 or pointwise convolutions, which together make up the vast majority of contemporary networks. We are able to leverage a number of methods that have been developed as efficient alternatives to fully-connected layers for pointwise substitution, allowing us to provide Pareto-optimal benefits in efficiency/accuracy.
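The abstract describes a two-step recipe: swap each expensive convolution for a cheap drop-in replacement without changing the architecture, then train the resulting student against the original teacher via distillation. The sketch below illustrates this in PyTorch under stated assumptions: the particular cheap substitute shown (a depthwise + pointwise pair), the module names, and the distillation hyperparameters are illustrative choices, not the paper's exact ones, which evaluates several alternative substitutions.

```python
# Minimal sketch of the substitute-then-distil recipe (assumptions noted above).
import copy

import torch.nn as nn
import torch.nn.functional as F


class CheapConv(nn.Module):
    """Drop-in replacement for a full 3x3 convolution: depthwise then pointwise.

    This is one possible cheap substitute; the paper considers several.
    """

    def __init__(self, conv: nn.Conv2d):
        super().__init__()
        self.depthwise = nn.Conv2d(
            conv.in_channels, conv.in_channels,
            kernel_size=conv.kernel_size, stride=conv.stride,
            padding=conv.padding, groups=conv.in_channels, bias=False)
        self.pointwise = nn.Conv2d(
            conv.in_channels, conv.out_channels, kernel_size=1,
            bias=conv.bias is not None)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))


def _swap_convs(module: nn.Module) -> None:
    # Recursively replace every full 3x3 convolution with the cheap block,
    # leaving the overall architecture unchanged.
    for name, child in module.named_children():
        if (isinstance(child, nn.Conv2d)
                and child.kernel_size == (3, 3) and child.groups == 1):
            setattr(module, name, CheapConv(child))
        else:
            _swap_convs(child)


def make_student(teacher: nn.Module) -> nn.Module:
    """Copy the trained teacher and substitute its expensive convolutions."""
    student = copy.deepcopy(teacher)
    _swap_convs(student)
    return student


def distillation_loss(student_logits, teacher_logits, targets, T=4.0, alpha=0.9):
    """Soft-target distillation: KL to the teacher plus cross-entropy to labels.

    T and alpha are illustrative hyperparameters, not values from the paper.
    """
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard
```

In a training loop, the teacher is kept frozen and run in evaluation mode; only the student's parameters (including the newly inserted cheap blocks) are updated against `distillation_loss`.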
Original language: English
Pages (from-to): 83199-83213
Journal: IEEE Access
Volume: 9
Early online date: 4 Jun 2021
DOIs
Publication status: E-pub ahead of print - 4 Jun 2021

Keywords

  • Machine learning
  • Deep Neural Networks
  • Computer Vision
  • DNN compression
