Abstract
Vision-language models such as CLIP are pretrained on large volumes of internet-sourced image and text pairs, and have been shown to sometimes exhibit impressive zero- and low-shot image classification performance. However, due to their size, fine-tuning these models on new datasets can be prohibitively expensive, both in terms of the supervision and compute required. To combat this, a series of lightweight adaptation methods have been proposed to efficiently adapt such models when limited supervision is available. In this work, we show that while effective on internet-style datasets, even those remedies under-deliver on classification tasks with images that differ significantly from those commonly found online. To address this issue, we present a new approach called SVL-Adapter that combines the complementary strengths of both vision-language pretraining and self-supervised representation learning. We report an average classification accuracy improvement of 10% in the low-shot setting when compared to existing methods, on a set of challenging visual classification tasks. Further, we present a fully automatic way of selecting an important blending hyperparameter for our model that does not require any held-out labeled validation data.
Code for our project is available here: https://github.com/omipan/svl_adapter.
Original language | English |
---|---|
Title of host publication | Proceedings of The 33rd British Machine Vision Conference (BMVC 2022) |
Publisher | BMVA Press |
Number of pages | 23 |
Publication status | Published - 25 Nov 2022 |
Event | The 33rd British Machine Vision Conference, 2022, London, United Kingdom; Duration: 21 Nov 2022 → 24 Nov 2022; Conference number: 33; https://www.bmvc2022.org/ |
Conference
Conference | The 33rd British Machine Vision Conference, 2022 |
---|---|
Abbreviated title | BMVC 2022 |
Country/Territory | United Kingdom |
City | London |
Period | 21/11/22 → 24/11/22 |
Internet address | https://www.bmvc2022.org/ |
Publication: 'SVL-Adapter: Self-Supervised Adapter for Vision-Language Pretrained Models'
Projects
- 1 Finished
- Enabling advanced autonomy through human-AI collaboration
  Fisher, B. (Principal Investigator), Bilen, H. (Co-investigator), Keller, F. (Co-investigator), Lascarides, A. (Co-investigator), Mac Aodha, O. (Co-investigator), Mollica, F. (Co-investigator), N, S. (Co-investigator), Ramamoorthy, R. (Co-investigator), Rovatsos, M. (Co-investigator) & Sevilla-Lara, L. (Co-investigator)
  1/10/21 → 30/06/22
  Project: Research