Fast, low-artifact speech synthesis considering global variance

M. Shannon, W. Byrne

Research output: Chapter in Book/Report/Conference proceeding - Conference contribution

Abstract

Speech parameter generation considering global variance (GV generation) is widely acknowledged to dramatically improve the quality of synthetic speech generated by HMM-based systems. However, it is slower and has higher latency than the standard speech parameter generation algorithm. In addition, it is known to produce artifacts, though existing approaches to prevent artifacts are effective. We present a simple new theoretical analysis of speech parameter generation considering global variance based on Lagrange multipliers. This analysis sheds light on one source of artifacts and suggests a way to reduce their occurrence. It also suggests an approximation to exact GV generation that allows fast, low-latency synthesis. In a subjective evaluation, our fast approximation shows no degradation in naturalness compared to conventional GV generation.
Original language: English
Title of host publication: Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Pages: 7869-7873
Number of pages: 5
ISBN (Print): 978-1-4799-0356-6
Publication status: Published - 2013

Keywords

  • hidden Markov models
  • speech synthesis
  • GV generation
  • HMM-based systems
  • Lagrange multipliers
  • global variance
  • low-artifact speech synthesis
  • standard speech parameter generation algorithm
  • subjective evaluation
  • synthetic speech quality
  • Abstracts
  • Educational institutions
  • Vectors
  • artifact
  • low latency
  • speech parameter generation considering global variance
