Non-Parallel Voice Conversion Using I-Vector PLDA: Towards Unifying Speaker Verification and Transformation

Tomi Kinnunen, Lauri Juvela, Paavo Alku, Junichi Yamagishi

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Text-independent speaker verification (recognizing speakers regardless of content) and non-parallel voice conversion (transforming voice identities without requiring content-matched training utterances) are related problems. We adopt the i-vector method for voice conversion. An i-vector is a fixed-dimensional representation of a speech utterance, which enables treating voice conversion in the utterance domain rather than the frame domain. The high dimensionality (800) and the small number of training utterances (24) necessitate using prior information about speakers. To this end, we adopt probabilistic linear discriminant analysis (PLDA) for voice conversion. The proposed approach requires no parallel utterances, transcriptions, or time alignment procedures at any stage.
Original language: English
Title of host publication: 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 5535-5539
Number of pages: 5
ISBN (Electronic): 978-1-5090-4117-6
Publication status: Published - 19 Jun 2017
Event: 42nd IEEE International Conference on Acoustics, Speech and Signal Processing - New Orleans, United States
Duration: 5 Mar 2017 to 9 Mar 2017
http://www.ieee-icassp2017.org/

Publication series

Publisher: IEEE
ISSN (Electronic): 2379-190X

Conference

Conference: 42nd IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2017
Country: United States
City: New Orleans
Period: 5/03/17 to 9/03/17
