A stochastic approach to Bi-Level optimization for hyperparameter optimization and meta learning

Minyoung Kim, Timothy Hospedales

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning, including hyperparameter optimization, loss function learning, few-shot learning, invariance learning and more. These problems are often formalized as Bi-Level optimizations (BLO). We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth probability distribution, and the outer loss becomes an expected loss over the inner distribution. To solve this stochastic optimization, we adopt Stochastic Gradient Langevin Dynamics (SGLD) MCMC to sample from the inner distribution, and propose a recurrent algorithm to compute the MC-estimated hypergradient. Our derivation is similar to forward-mode differentiation, but we introduce a new first-order approximation that makes it feasible for large models without needing to store huge Jacobian matrices. The main benefits are two-fold: i) Our stochastic formulation takes into account uncertainty, which makes the method robust to suboptimal inner optimization or to multiple non-unique inner minima caused by overparametrization; ii) Compared to existing methods that often exhibit unstable behavior and hyperparameter sensitivity in practice, our method leads to considerably more reliable solutions. We demonstrate that the new approach achieves promising results on diverse meta learning problems and easily scales to learning 87M hyperparameters in the case of Vision Transformers.
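
The stochastic view described in the abstract can be illustrated on a one-dimensional toy problem. The sketch below is not the authors' recurrent algorithm. It assumes a unit-temperature inner distribution p(w | lam) proportional to exp(-inner_loss(w, lam)), uses exact gradients in the Langevin update (with minibatch gradients the same update would be SGLD), and swaps in the standard score-function identity d/dlam E[outer] = -Cov(outer, d inner / d lam) in place of the paper's hypergradient estimator. All names (langevin_samples, estimate, BETA, TARGET) are hypothetical and chosen for illustration only.

```python
import numpy as np

BETA, TARGET = 4.0, 0.0  # assumed toy constants, not taken from the paper


def inner_grad_w(w, lam):
    # Gradient in w of the toy inner loss 0.5 * BETA * (w - lam)^2.
    return BETA * (w - lam)


def inner_grad_lam(w, lam):
    # Gradient in lam of the same toy inner loss.
    return -BETA * (w - lam)


def outer_loss(w):
    # Toy outer loss 0.5 * (w - TARGET)^2.
    return 0.5 * (w - TARGET) ** 2


def langevin_samples(grad_inner, n_chains, step, n_steps, rng):
    # Run n_chains independent Langevin chains and return their final states:
    # w <- w - step * grad + sqrt(2 * step) * standard normal noise.
    w = np.zeros(n_chains)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_chains)
        w = w - step * grad_inner(w) + np.sqrt(2.0 * step) * noise
    return w


def estimate(lam, seed=0):
    # Monte Carlo estimates of the expected outer loss and of its derivative
    # with respect to lam, using the covariance (score-function) identity.
    rng = np.random.default_rng(seed)
    w = langevin_samples(lambda x: inner_grad_w(x, lam), 4000, 0.01, 500, rng)
    g = outer_loss(w)
    expected_outer = g.mean()
    hypergrad = -np.mean((g - g.mean()) * inner_grad_lam(w, lam))
    return expected_outer, hypergrad


if __name__ == "__main__":
    value, grad = estimate(lam=1.0)
    print("expected outer loss:", value)
    print("hypergradient estimate:", grad)
```

For this toy problem the inner distribution is Gaussian with mean lam and variance 1/BETA, so the expected outer loss is 0.5 * ((lam - TARGET)^2 + 1/BETA) and its derivative is lam - TARGET. The estimates should land close to those values, up to Monte Carlo noise and a small bias from the finite step size.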
Original language: English
Title of host publication: Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence
Editors: Toby Walsh, Julie Shah, Zico Kolter
Place of Publication: Washington, DC, USA
Publisher: AAAI Press
Pages: 17913-17920
Number of pages: 8
ISBN (Electronic): 9781577358978
Publication status: Published - 11 Apr 2025
Event: The 39th Annual AAAI Conference on Artificial Intelligence - Pennsylvania Convention Center, Philadelphia, United States
Duration: 25 Feb 2025 to 4 Mar 2025
Conference number: 39
https://aaai.org/conference/aaai/aaai-25/

Publication series

Name: Proceedings of the AAAI Conference on Artificial Intelligence
Publisher: AAAI Press
Number: 17
Volume: 39
ISSN (Print): 2159-5399
ISSN (Electronic): 2374-3468

Conference

Conference: The 39th Annual AAAI Conference on Artificial Intelligence
Abbreviated title: AAAI-25
Country/Territory: United States
City: Philadelphia
Period: 25/02/25 to 4/03/25

Keywords / Materials (for Non-textual outputs)

  • machine learning
