Voice liveness detection algorithms based on pop noise caused by human breath for automatic speaker verification

Sayaka Shiota, Fernando Villavicencio, Junichi Yamagishi, Nobutaka Ono, Isao Echizen, Tomoko Matsui

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract / Description of output

This paper proposes a novel countermeasure framework to detect spoofing attacks to reduce the vulnerability of automatic speaker verification (ASV) systems. Recently, ASV systems have reached equivalent performances equivalent to those of other biometric modalities. However, spoofing techniques against these systems have also progressed drastically. Experimentation using advanced speech synthesis and voice conversion techniques has showed unacceptable false acceptance rates and several new countermeasure algorithms have been explored to detect spoofing materials accurately. However, the countermeasures proposed so far are based on the acoustic differences between natural speech signals and artificial speech signals, expected to become gradually smaller in the near future. In this paper, we focus on voice liveness detection, which aims to validate whether the presented speech signals originated from a live human. We use the phenomenon of pop noise, which is a distortion that happens when human breath reaches a microphone,as liveness evidence. This paper proposes pop noise detection algorithms and shows through an experimental study that they can be used to discriminate live voice signals from artificial ones generated by means of speech synthesis techniques.
Original languageEnglish
Title of host publicationINTERSPEECH 2015 16th Annual Conference of the International Speech Communication Association
PublisherInternational Speech Communication Association
Pages239-243
Number of pages5
Publication statusPublished - Sept 2015

Fingerprint

Dive into the research topics of 'Voice liveness detection algorithms based on pop noise caused by human breath for automatic speaker verification'. Together they form a unique fingerprint.

Cite this