Multirate Training of Neural Networks

Tiffany Vlaar*, Benedict Leimkuhler

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

We propose multirate training of neural networks: partitioning neural network parameters into “fast” and “slow” parts which are trained on different time scales, where slow parts are updated less frequently. By choosing appropriate partitionings we can obtain substantial computational speed-up for transfer learning tasks. We show for applications in vision and NLP that we can fine-tune deep neural networks in almost half the time, without reducing the generalization performance of the resulting models. We analyze the convergence properties of our multirate scheme and draw a comparison with vanilla SGD. We also discuss splitting choices for the neural network parameters which could enhance generalization performance when neural networks are trained from scratch. A multirate approach can be used to learn different features present in the data and as a form of regularization. Our paper unlocks the potential of using multirate techniques for neural network training and provides several starting points for future work in this area.
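The core mechanism described in the abstract, updating a "slow" subset of the parameters less frequently than a "fast" subset, can be illustrated with a short PyTorch-style sketch. The partition used here (classifier head as "fast", pretrained backbone as "slow"), the update ratio k, and all function and variable names are illustrative assumptions, not the authors' exact scheme.

```python
import torch

def multirate_sgd_step(model, loss_fn, batch, fast_opt, slow_opt, step, k=4):
    """One illustrative multirate step: fast parameters are updated every step,
    slow parameters only every k-th step. This is a minimal sketch of the idea
    in the abstract, not the algorithm as specified in the paper."""
    inputs, targets = batch
    loss = loss_fn(model(inputs), targets)
    fast_opt.zero_grad()
    slow_opt.zero_grad()
    loss.backward()
    fast_opt.step()                # fast parameters: updated at every step
    if step % k == 0:
        slow_opt.step()            # slow parameters: updated less frequently
    return loss.item()

# Hypothetical partition for a transfer-learning setup (e.g. a ResNet with a
# final layer named "fc"): the head is trained as "fast", the backbone as "slow".
# fast_params = model.fc.parameters()
# slow_params = [p for n, p in model.named_parameters() if not n.startswith("fc")]
# fast_opt = torch.optim.SGD(fast_params, lr=0.1)
# slow_opt = torch.optim.SGD(slow_params, lr=0.1)
```

Because the slow optimizer is stepped only once every k iterations, most of the per-step work concentrates on the fast partition, which is the source of the reported speed-up in fine-tuning; the choice of partition and of k are the main design decisions left to the user.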

Original language: English
Title of host publication: International Conference on Machine Learning, 2022
Publisher: ML Research Press
Pages: 22342-22360
Number of pages: 19
Publication status: Published - 24 Jul 2022
Event: 39th International Conference on Machine Learning, ICML 2022 - Baltimore, United States
Duration: 17 Jul 2022 - 23 Jul 2022

Publication series

Name: Proceedings of Machine Learning Research
Volume: 162
ISSN (Print): 2640-3498

Conference

Conference: 39th International Conference on Machine Learning, ICML 2022
Country/Territory: United States
City: Baltimore
Period: 17/07/22 - 23/07/22
