Non linear time compression of clear and normal speech at high rates

Cassia Valentini-Botinhao, Mirjam Wester, Junichi Yamagishi, Markus Toman, Michael Pucher, Dietmar Schabus

Research output: Working paper

Abstract

We compare a series of time compression methods applied to normal and clear speech. First, we evaluate a linear (uniform) method applied to these styles as well as to naturally produced fast speech. We found, in line with the literature, that unprocessed fast speech was less intelligible than linearly compressed normal speech. Fast speech was also less intelligible than compressed clear speech, but at the highest rate (three times faster than normal) the advantage of clear over fast speech was lost. To test whether this was due to shorter speech duration, we evaluate, in our second set of experiments, a range of methods that compress speech and silence at different rates. We found that even when the overall duration of speech and silence is kept the same across styles, compressed normal speech is still more intelligible than compressed clear speech. Compressing silence twice as much as speech improved results further for normal speech, with very little additional computational cost.
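
As a rough illustration of compressing speech and silence at different rates while holding the overall duration fixed, the sketch below solves for the two rates. It is not the authors' implementation: the function name `split_rates`, its parameters, and the reading of "twice as much" as a silence rate equal to twice the speech rate are all assumptions made here for illustration.

```python
def split_rates(speech_s, silence_s, overall_rate, silence_ratio=2.0):
    """Return (speech_rate, silence_rate) for a non-uniform compression.

    Hypothetical helper, not taken from the paper. Assumes:
      - the compressed total must equal (speech_s + silence_s) / overall_rate
      - silence_rate = silence_ratio * speech_rate
    """
    if speech_s <= 0 or silence_s < 0 or overall_rate <= 0 or silence_ratio <= 0:
        raise ValueError("durations, rate and ratio must be positive")

    # Target duration after compression, identical to the uniform case.
    target = (speech_s + silence_s) / overall_rate

    # Solve speech_s / r + silence_s / (silence_ratio * r) = target for r.
    speech_rate = (speech_s + silence_s / silence_ratio) / target
    return speech_rate, silence_ratio * speech_rate


# Example: 2.4 s of speech, 0.6 s of silence, overall 3x compression.
speech_rate, silence_rate = split_rates(2.4, 0.6, 3.0)
compressed_total = 2.4 / speech_rate + 0.6 / silence_rate
```

Under these assumptions the speech segments are compressed by a factor below the overall rate (about 2.7 here) and the silences by a factor above it (about 5.4), so the utterance keeps the same total duration as uniform 3x compression while the speech itself is shortened less.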
Original language: English
Publisher: arXiv.org
Number of pages: 5
Publication status: Published - 1 Jan 2019

Keywords

  • Electrical Engineering and Systems Science - Audio and Speech Processing
