BOOTSTRAP PREDICTIVE CODING: INVESTIGATING A NON-CONTRASTIVE SELF-SUPERVISED LEARNING APPROACH

Yumnah Mohamied, Peter Bell

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Self-supervised learning (SSL) methods have seen wide popularity for speech representation learning. Early methods, such as wav2vec, were causal, whilst more recent approaches, notably wav2vec 2.0 and data2vec, have employed masking strategies together with a Transformer architecture. Many SSL methods use contrastive learning; however, non-contrastive methods, while susceptible to representational collapse, have recently seen success in other fields and have been successfully applied to speech in data2vec. This work returns to the study of causal models, comparing them with equivalent non-causal variants, motivated by our observation that non-contrastive SSL models have never been investigated in this setting. To this end, we propose a novel approach, Bootstrapped Predictive Coding (BPC), a causal non-contrastive SSL model. We find that causal SSL models outperform their non-causal counterparts in both contrastive and non-contrastive training setups, and that representations obtained with BPC give the best overall performance when evaluated on phone frame classification.
Original language: English
Title of host publication: ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: IEEE
Pages: 11541-11545
Number of pages: 5
ISBN (Electronic): 979-8-3503-4485-1
Publication status: Published - 18 Mar 2024
Event: 2024 IEEE International Conference on Acoustics, Speech and Signal Processing - Seoul, Korea, Republic of
Duration: 14 Apr 2024 to 19 Apr 2024
https://2024.ieeeicassp.org/

Publication series

Name: International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Publisher: IEEE
ISSN (Print): 1520-6149
ISSN (Electronic): 2379-190X

Conference

Conference: 2024 IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2024
Country/Territory: Korea, Republic of
City: Seoul
Period: 14/04/24 to 19/04/24

Keywords / Materials (for Non-textual outputs)

  • self-supervised learning
  • speech recognition
  • non-contrastive methods
  • masked acoustic modelling
