Edinburgh Research Explorer

A Lattice-based Approach to Automatic Filled Pause Insertion

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Original language: English
Title of host publication: Proc. of DiSS 2015, The 7th Workshop on Disfluencies in Spontaneous Speech
Place of Publication: Edinburgh
Number of pages: 4
Publication status: Published - 10 Aug 2015

Abstract

This paper describes a novel method for automatically inserting filled pauses (e.g., UM) into fluent texts. Although filled pauses are known to serve a wide range of psychological and structural functions in conversational speech, they have not traditionally been modelled overtly by state-of-the-art speech synthesis systems. However, several recent systems have started to model disfluencies specifically, and so there is an increasing need to create disfluent speech synthesis input by automatically inserting filled pauses into otherwise fluent text. The approach presented here interpolates Ngrams and Full-Output Recurrent Neural Network Language Models (f-RNNLMs) in a lattice-rescoring framework. It is shown that the interpolated system outperforms separate Ngram and f-RNNLM systems, where performance is analysed using the Precision, Recall, and F-score metrics.
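
The two ideas named in the abstract, interpolating language-model scores and evaluating insertions with Precision, Recall, and F-score, can be illustrated with a minimal sketch. It assumes linear interpolation with a single weight `lam`, which the abstract does not specify, and it represents filled-pause insertions as sets of word positions. The function names, the weight, and the example values are all hypothetical and are not taken from the paper.

```python
from typing import Set, Tuple


def interpolate(p_ngram: float, p_rnnlm: float, lam: float = 0.5) -> float:
    """Linearly interpolate two language-model probabilities.

    Assumption: a single weight `lam` on the n-gram probability; the
    abstract only says the two models are interpolated.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * p_ngram + (1.0 - lam) * p_rnnlm


def precision_recall_f(predicted: Set[int], reference: Set[int]) -> Tuple[float, float, float]:
    """Score predicted filled-pause positions against reference positions."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f_score = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f_score


if __name__ == "__main__":
    # Hypothetical probabilities for one candidate filled-pause position.
    print(interpolate(p_ngram=0.25, p_rnnlm=0.75, lam=0.5))
    # Hypothetical insertion positions: system output vs. reference transcript.
    print(precision_recall_f({1, 3, 5}, {1, 5, 7, 9}))
```

In a lattice-rescoring setting, a score of this kind would be attached to each lattice arc before searching for the best path. That step is omitted here because the abstract gives no detail on it.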

Research areas

  • Disfluency, Filled Pauses, f-RNNLMs, Ngrams, Lattices

ID: 21134311