Interactive visualisation techniques for dynamic speech transcription, correction and training

Saturnino Luz*, Masood Masoodian, Bill Rogers

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

As performance gains in automatic speech recognition systems plateau, improvements to existing applications of speech recognition technology seem more likely to come from better user interface design than from further progress in core recognition components. Among all applications of speech recognition, the usability of systems for transcription of spontaneous speech is particularly sensitive to high word error rates. This paper presents a series of approaches to improving the usability of such applications. We propose new mechanisms for error correction, use of contextual information, and use of 3D visualisation techniques to improve user interaction with a recogniser and maximise the impact of user feedback. These proposals are illustrated through several prototypes which target tasks such as off-line transcript editing, dynamic transcript editing, and real-time visualisation of recognition paths. An evaluation of our dynamic transcript editing system demonstrates the gains that can be made by adding the corrected words to the recogniser's dictionary and then propagating the user's corrections.
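
The abstract describes propagating a user's correction by adding the corrected word to the recogniser's dictionary and re-applying the fix elsewhere in the transcript. The Python sketch below illustrates that general idea only; the `Token` and `TranscriptEditor` classes, the `correct` method, and the 0.6 confidence threshold are assumptions made for illustration and do not reflect the prototypes evaluated in the paper.

```python
# Hypothetical illustration of correction propagation in a transcript editor.
# All names and the confidence threshold are assumptions for this sketch; they
# do not correspond to the systems described in the paper.

from dataclasses import dataclass, field


@dataclass
class Token:
    """A recognised word together with the recogniser's confidence score."""
    text: str
    confidence: float


@dataclass
class TranscriptEditor:
    tokens: list                                   # recogniser output, in order
    dictionary: set = field(default_factory=set)   # words added from user corrections

    def correct(self, index: int, corrected: str) -> int:
        """Apply a user correction at `index`, add the corrected word to the
        dictionary, and propagate the fix to other low-confidence occurrences
        of the same misrecognised word. Returns the number of tokens changed."""
        wrong = self.tokens[index].text
        self.tokens[index] = Token(corrected, 1.0)   # user input is trusted
        self.dictionary.add(corrected)               # grow the recogniser's lexicon

        changed = 1
        for i, tok in enumerate(self.tokens):
            # Propagate only to matching tokens the recogniser was unsure about.
            if i != index and tok.text == wrong and tok.confidence < 0.6:
                self.tokens[i] = Token(corrected, 1.0)
                changed += 1
        return changed


if __name__ == "__main__":
    transcript = TranscriptEditor(tokens=[
        Token("the", 0.95), Token("wreck", 0.40), Token("a", 0.30),
        Token("nice", 0.50), Token("wreck", 0.35),
    ])
    # The user replaces the first "wreck"; the other uncertain occurrence
    # of the same misrecognised word is updated as well.
    n = transcript.correct(1, "recognise")
    print(n, [t.text for t in transcript.tokens])
```

In this toy version, propagation is keyed on an exact match of the misrecognised word plus a confidence cut-off; a real system could instead match on acoustic or contextual similarity, which the abstract leaves unspecified.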

Original language: English
Title of host publication: Proceedings of the 9th ACM SIGCHI New Zealand Chapter's International Conference on Human-Computer Interaction
Subtitle of host publication: Design Centered HCI, CHINZ 2008
Pages: 9-16
Number of pages: 8
DOIs
Publication status: Published - Aug 2008
Event: ACM SIGCHI New Zealand Chapter's International Conference on Human-Computer Interaction: Design Centered HCI, CHINZ 2008 - Wellington, New Zealand
Duration: 2 Jul 2008 - 2 Jul 2008

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: ACM SIGCHI New Zealand Chapter's International Conference on Human-Computer Interaction: Design Centered HCI, CHINZ 2008
Country/Territory: New Zealand
City: Wellington
Period: 2/07/08 - 2/07/08

Keywords

  • Automatic Speech Transcription
  • Error-correction
  • Semi-automatic Speech Transcription
  • Speech Recogniser Training
