Detecting High Level Dialog Structure Without Lexical Information

M. P. Aylett

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The potentially enormous audio resources now available both within organizations and on the Internet present a serious challenge to audio browsing technology. In this paper we outline a set of techniques that can be used to determine high-level dialog structure without requiring resource-intensive automatic speech recognition (ASR). Using syllable-finding algorithms based on band-pass energy together with prosodic feature extraction, we show that a sub-lexical approach to prosodic analysis can outperform results based on ASR, and even those based on a word alignment, which requires a complete transcription. We consider how these techniques could be integrated into ASR technology and suggest a framework for extending this type of sub-lexical prosodic analysis.
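
The abstract names syllable finding based on band-pass energy but gives no algorithmic detail, so the following is only a minimal sketch of the general idea and not the paper's method. Everything in it is an assumption made for illustration: the crude band-pass filter (the difference of two first-order low-pass filters), the 300-2500 Hz band, the 10 ms frame, the smoothing width, the relative threshold, and all function names. It filters a signal, computes per-frame energy, smooths the envelope, and picks local maxima as candidate syllable nuclei.

```python
import math

def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order IIR low-pass filter (a deliberately crude stand-in)."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, prev = [], 0.0
    for s in samples:
        prev = (1.0 - a) * s + a * prev
        out.append(prev)
    return out

def band_energy(samples, sample_rate, lo_hz=300.0, hi_hz=2500.0, frame_s=0.01):
    """Mean band-limited energy per non-overlapping frame.

    The band edges and frame length are assumed values, not from the paper.
    """
    upper = one_pole_lowpass(samples, hi_hz, sample_rate)
    lower = one_pole_lowpass(samples, lo_hz, sample_rate)
    band = [u - l for u, l in zip(upper, lower)]  # rough band-pass
    hop = max(1, round(frame_s * sample_rate))
    return [sum(v * v for v in band[i:i + hop]) / hop
            for i in range(0, len(band) - hop + 1, hop)]

def smooth(envelope, width=5):
    """Centered moving average; the window shrinks at the edges."""
    half = width // 2
    out = []
    for i in range(len(envelope)):
        window = envelope[max(0, i - half):i + half + 1]
        out.append(sum(window) / len(window))
    return out

def find_syllable_nuclei(envelope, rel_threshold=0.2, min_gap=10):
    """Frame indices of local energy maxima above a relative threshold.

    Maxima closer than min_gap frames are merged, keeping the larger one.
    """
    if not envelope:
        return []
    floor = rel_threshold * max(envelope)
    peaks = []
    for i in range(1, len(envelope) - 1):
        is_peak = envelope[i] > envelope[i - 1] and envelope[i] >= envelope[i + 1]
        if envelope[i] >= floor and is_peak:
            if peaks and i - peaks[-1] < min_gap:
                if envelope[i] > envelope[peaks[-1]]:
                    peaks[-1] = i
            else:
                peaks.append(i)
    return peaks

# Toy input: three Hann-shaped bursts of a 1 kHz tone standing in for syllables.
sr = 8000
centers = (0.2, 0.5, 0.8)  # burst centers in seconds

def toy_amplitude(t):
    return sum(0.5 * (1.0 + math.cos(2.0 * math.pi * (t - c) / 0.2))
               for c in centers if abs(t - c) < 0.1)

signal = [toy_amplitude(k / sr) * math.sin(2.0 * math.pi * 1000.0 * k / sr)
          for k in range(sr)]
envelope = smooth(band_energy(signal, sr))
peaks = find_syllable_nuclei(envelope)
print([round(p * 0.01, 2) for p in peaks])  # approximate nucleus times in seconds
```

On this synthetic signal the detector should report one nucleus per burst. Real speech would need a properly designed band-pass filter and tuned thresholds, and prosodic features such as duration or pitch would then be measured around the detected nuclei.
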
Original language: English
Title of host publication: 2006 IEEE International Conference on Acoustics Speech and Signal Processing Proceedings
Publisher: Institute of Electrical and Electronics Engineers
Pages: I-I
Volume: 1
ISBN (Print): 1-4244-0469-X
Publication status: Published - 1 May 2006
