On the Transferability of Large-Scale Self-Supervision to Few-Shot Audio Classification

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, self-supervised learning has excelled owing to its capacity to learn robust feature representations from unlabelled data. Networks pretrained through self-supervision serve as effective feature extractors for downstream tasks, including few-shot learning. While the evaluation of unsupervised approaches for few-shot learning is well established in the image domain, it is notably absent in acoustics. This study addresses that gap by assessing the performance of large-scale self-supervised models on few-shot audio classification. Additionally, we explore the relationship between a model's few-shot learning capability and its performance on other downstream task benchmarks. Our findings reveal state-of-the-art performance on some few-shot problems such as SpeechCommandsv2, as well as strong correlations between speech-based few-shot problems and various downstream audio tasks.
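The abstract describes using self-supervision-pretrained networks as frozen feature extractors for few-shot classification. A common evaluation protocol for this setting is nearest-centroid (prototype) classification over the extracted embeddings; the sketch below illustrates that general protocol with toy data — the function name and dimensions are illustrative assumptions, not details from the paper:

```python
import numpy as np

def nearest_centroid_few_shot(support_feats, support_labels, query_feats):
    """Classify query embeddings by distance to per-class centroids.

    support_feats: (n_support, dim) frozen embeddings of labelled examples
    support_labels: (n_support,) integer class labels
    query_feats: (n_query, dim) embeddings to classify
    """
    classes = np.unique(support_labels)
    # One prototype per class: the mean embedding of its support examples.
    prototypes = np.stack(
        [support_feats[support_labels == c].mean(axis=0) for c in classes]
    )
    # Euclidean distance from every query to every prototype.
    dists = np.linalg.norm(
        query_feats[:, None, :] - prototypes[None, :, :], axis=-1
    )
    # Assign each query to its nearest prototype's class.
    return classes[dists.argmin(axis=1)]

# Toy 2-way 5-shot episode with well-separated synthetic "embeddings".
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(0, 0.1, (5, 8)), rng.normal(5, 0.1, (5, 8))])
labels = np.array([0] * 5 + [1] * 5)
query = np.vstack([rng.normal(0, 0.1, (3, 8)), rng.normal(5, 0.1, (3, 8))])
print(nearest_centroid_few_shot(support, labels, query))  # → [0 0 0 1 1 1]
```

In practice the embeddings would come from a pretrained audio model rather than a random generator; the classifier itself needs no training, which is what makes it a convenient probe of representation quality.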
Original language: English
Title of host publication: IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops (ICASSPW)
Publisher: Institute of Electrical and Electronics Engineers
Publication status: Published - 15 Aug 2024
