Detection of synthetic speech for the problem of imposture

P.L. De Leon, I. Hernaez, I. Saratxaga, M. Pucher, J. Yamagishi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


In this paper, we present new results from our research into the vulnerability of a speaker verification (SV) system to synthetic speech. We use an HMM-based speech synthesizer, which creates synthetic speech for a targeted speaker through adaptation of a background model, together with both GMM-UBM and support vector machine (SVM) SV systems. Using 283 speakers from the Wall Street Journal (WSJ) corpus, our SV systems have a 0.35% EER. When the systems are tested with synthetic speech generated from speaker models derived from the WSJ corpus, over 91% of the matched claims are accepted. We propose the use of relative phase shift (RPS) to detect synthetic speech and develop a GMM-based synthetic speech classifier (SSC). Using the SSC, we are able to correctly classify human speech in 95% of tests and synthetic speech in 88% of tests, thus significantly reducing the vulnerability.
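The detection approach described in the abstract scores each utterance under two Gaussian mixture models (one trained on human speech, one on synthetic speech) and picks the higher-scoring class. A minimal sketch of that decision rule is below, assuming precomputed feature vectors; the paper's actual RPS feature extraction (harmonic phase analysis) is not reproduced here, and the well-separated random 2-D features, means, and sample counts are stand-ins, not values from the paper. For simplicity each class model is a single diagonal-covariance Gaussian, i.e. a one-component GMM.

```python
# Sketch of a two-class GMM-style classifier in the spirit of the paper's
# synthetic speech classifier (SSC). Features are hypothetical stand-ins
# for RPS features; real systems would use multi-component GMMs.
import numpy as np

rng = np.random.default_rng(0)

def fit_diag_gaussian(X):
    """Fit a single diagonal-covariance Gaussian (a 1-component GMM)."""
    return X.mean(axis=0), X.var(axis=0) + 1e-6  # small floor for stability

def log_likelihood(X, mean, var):
    """Per-sample log-likelihood under a diagonal Gaussian."""
    return -0.5 * (np.log(2 * np.pi * var) + (X - mean) ** 2 / var).sum(axis=1)

# Stand-in features: assume human and synthetic speech occupy
# different regions of the (hypothetical) RPS feature space.
human_train = rng.normal(0.0, 1.0, size=(500, 2))
synth_train = rng.normal(2.5, 1.0, size=(500, 2))

h_mean, h_var = fit_diag_gaussian(human_train)
s_mean, s_var = fit_diag_gaussian(synth_train)

def is_synthetic(X):
    """Likelihood-ratio decision: True where the synthetic model wins."""
    return log_likelihood(X, s_mean, s_var) > log_likelihood(X, h_mean, h_var)

human_test = rng.normal(0.0, 1.0, size=(200, 2))
synth_test = rng.normal(2.5, 1.0, size=(200, 2))
human_acc = 1.0 - is_synthetic(human_test).mean()  # human correctly accepted
synth_acc = is_synthetic(synth_test).mean()        # synthetic correctly flagged
```

With the assumed separation between the two classes, both accuracies land in the same high-90s range the paper reports for its SSC, though the numbers here come from toy data, not the WSJ experiments.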
Original language: English
Title of host publication: Acoustics, Speech and Signal Processing (ICASSP)
Subtitle of host publication: 2011 IEEE International Conference
Number of pages: 4
Publication status: Published - 1 May 2011


  • hidden Markov models
  • speaker recognition
  • speaker verification (SV) system
  • speech synthesis
  • support vector machines
  • EER
  • GMM-based synthetic speech classifier (SSC)
  • HMM-based speech synthesizer
  • relative phase shift (RPS)
  • Wall Street Journal (WSJ) corpus
  • Adaptation models
  • Harmonic analysis
  • Humans
  • Speech
  • Training
  • Security


