High-Quality Nonparallel Voice Conversion Based On Cycle-Consistent Adversarial Network

Fuming Fang, Junichi Yamagishi, Isao Echizen, Jaime Lorenzo-Trueba

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Although voice conversion (VC) algorithms have achieved remarkable success along with the development of machine learning, superior performance is still difficult to achieve when using nonparallel data. In this paper, we propose using a cycle-consistent adversarial network (CycleGAN) for nonparallel data-based VC training. A CycleGAN is a generative adversarial network (GAN) originally developed for unpaired image-to-image translation. A subjective evaluation of inter-gender conversion demonstrated that the proposed method significantly outperformed a method based on the Merlin open source neural network speech synthesis system (a parallel VC system adapted for our setup) and a GAN-based parallel VC system. This is the first research to show that the performance of a nonparallel VC method can exceed that of state-of-the-art parallel VC methods.
Index Terms— Voice conversion, deep learning, cycle-consistent adversarial network, generative adversarial network
Original language: English
Title of host publication: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Subtitle of host publication: Calgary, AB, Canada
Place of Publication: Calgary, Alberta, Canada
Publisher: Institute of Electrical and Electronics Engineers
Pages: 5279-5283
Number of pages: 5
ISBN (Electronic): 978-1-5386-4658-8
ISBN (Print): 978-1-5386-4659-5
Publication status: Published - 13 Sept 2018
Event: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing - Calgary, Canada
Duration: 15 Apr 2018 → 20 Apr 2018
https://2018.ieeeicassp.org/
https://2018.ieeeicassp.org/default.asp

Publication series

Publisher: IEEE
ISSN (Electronic): 2379-190X

Conference

Conference: 2018 IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2018
Country/Territory: Canada
City: Calgary
Period: 15/04/18 → 20/04/18