Abstract
We propose a novel hierarchical Bayesian model for the few-shot meta learning problem. We consider episode-wise random variables to model episode-specific generative processes, where these local random variables are governed by a higher-level global random variable. The global variable captures information shared across episodes, while controlling how much the model needs to be adapted to new episodes in a principled Bayesian manner. Within our framework, prediction on a novel episode/task can be seen as a Bayesian inference problem. For tractable training, we need to be able to relate each local episode-specific solution to the global higher-level parameters. We propose a Normal-Inverse-Wishart model, for which establishing this local-global relationship becomes feasible due to the approximate closed-form solutions for the local posterior distributions. The resulting algorithm is more attractive than MAML in that it does not maintain a costly computational graph for the sequence of gradient descent steps in an episode. Our approach also differs from existing Bayesian meta learning methods: rather than modeling a single random variable for all episodes, it leverages a hierarchical structure that exploits the local-global relationships desirable for principled Bayesian learning with many related tasks.
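As a rough illustration of why a Normal-Inverse-Wishart (NIW) prior makes local-global relationships tractable, the sketch below implements the textbook conjugate NIW update for Gaussian observations with unknown mean and covariance. It is not the algorithm described in the abstract, where the local posteriors are only approximately closed-form. The function name `niw_posterior` and the parameter names (`mu0`, `kappa0`, `psi0`, `nu0`) are assumptions chosen for this example.

```python
import numpy as np


def niw_posterior(x, mu0, kappa0, psi0, nu0):
    """Textbook conjugate NIW update (illustrative; names are hypothetical).

    x      : (n, d) array of observations assumed drawn from N(mu, Sigma)
    mu0    : (d,) prior mean
    kappa0 : prior strength on the mean (pseudo-count)
    psi0   : (d, d) prior scale matrix
    nu0    : prior degrees of freedom
    Returns the posterior parameters (mu_n, kappa_n, psi_n, nu_n).
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))
    mu0 = np.asarray(mu0, dtype=float)
    psi0 = np.asarray(psi0, dtype=float)

    n = x.shape[0]
    xbar = x.mean(axis=0)

    # Scatter matrix of the data around the sample mean.
    centered = x - xbar
    scatter = centered.T @ centered

    kappa_n = kappa0 + n
    nu_n = nu0 + n

    # Posterior mean: precision-weighted blend of prior mean and sample mean.
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n

    # Posterior scale: prior scale + data scatter + prior/data disagreement.
    diff = (xbar - mu0).reshape(-1, 1)
    psi_n = psi0 + scatter + (kappa0 * n / kappa_n) * (diff @ diff.T)

    return mu_n, kappa_n, psi_n, nu_n
```

In this toy setting, `kappa0` plays the role of a shrinkage strength: a larger value pulls `mu_n` toward the prior mean, loosely analogous to a global variable controlling how far a local solution may adapt.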
Original language | English |
---|---|
Title of host publication | Proceedings of The Twelfth International Conference on Learning Representations |
Number of pages | 28 |
Publication status | Accepted/In press - 16 Jan 2024 |
Event | The Twelfth International Conference on Learning Representations, Vienna, Austria. Duration: 7 May 2024 → 11 May 2024. https://iclr.cc/ |
Conference
Conference | The Twelfth International Conference on Learning Representations |
---|---|
Abbreviated title | ICLR 2024 |
Country/Territory | Austria |
City | Vienna |
Period | 7/05/24 → 11/05/24 |
Keywords / Materials (for Non-textual outputs)
- Bayesian models
- Meta learning
- Few-shot learning