Abstract
Speaker recognition is a key component of emerging Internet of Things (IoT) smart services, such as voice control and personalized applications. Although speaker recognition systems can attain excellent performance on synthetic datasets, performance can degrade significantly in real-world operation. The key reason is the lack of sufficient labeled data for model adaptation, primarily due to the cost of manual annotation and enrollment. A recent solution to this problem is to use cross-modal identifiers, e.g. WiFi sniffing, to gradually associate an identity with a certain vocal feature, as in Simultaneous Clustering and Naming (SCAN). In this paper, we demonstrate how to further improve the performance of these cross-modal systems in the wild by iteratively adapting the feature extractor based on the output of the noisy association and clustering step. We show that this feedback loop improves not only overall accuracy but also the labeling coverage of the association result. Our approach, iSCAN, is a further step towards a robust and zero-effort speaker recognition system for the IoT.
Original language | English |
---|---|
Title of host publication | Adjunct Proceedings of the 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2019 ACM International Symposium on Wearable Computers |
Place of Publication | New York, NY, USA |
Publisher | ACM Association for Computing Machinery |
Pages | 529–533 |
Number of pages | 5 |
ISBN (Print) | 9781450368698 |
Publication status | Published - 9 Sep 2019 |
Event | The 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing, colocated with ISWC 2019, London, United Kingdom. Duration: 9 Sep 2019 → 13 Sep 2019. https://ubicomp.org/ubicomp2019/home.html |
Conference
Conference | The 2019 ACM International Joint Conference on Pervasive and Ubiquitous Computing, colocated with ISWC 2019. |
---|---|
Abbreviated title | UbiComp 2019 |
Country/Territory | United Kingdom |
City | London |
Period | 9/09/19 → 13/09/19 |
Internet address | https://ubicomp.org/ubicomp2019/home.html |
Keywords
- internet of things
- speaker adaptation
- automatic utterance labeling