TY - GEN
T1 - Interactive visualisation techniques for dynamic speech transcription, correction and training
AU - Luz, Saturnino
AU - Masoodian, Masood
AU - Rogers, Bill
PY - 2008
Y1 - 2008/8
N2 - As performance gains in automatic speech recognition systems plateau, improvements to existing applications of speech recognition technology seem more likely to come from better user interface design than from further progress in core recognition components. Among all applications of speech recognition, the usability of systems for transcription of spontaneous speech is particularly sensitive to high word error rates. This paper presents a series of approaches to improving the usability of such applications. We propose new mechanisms for error correction, use of contextual information, and use of 3D visualisation techniques to improve user interaction with a recogniser and maximise the impact of user feedback. These proposals are illustrated through several prototypes which target tasks such as: off-line transcript editing, dynamic transcript editing, and real-time visualisation of recognition paths. An evaluation of our dynamic transcript editing system demonstrates the gains that can be made by adding the corrected words to the recogniser's dictionary and then propagating the user's corrections.
KW - Automatic Speech Transcription
KW - Error-correction
KW - Semi-automatic Speech Transcription
KW - Speech Recogniser Training
UR - http://www.scopus.com/inward/record.url?scp=70349098534&partnerID=8YFLogxK
U2 - 10.1145/1496976.1496978
DO - 10.1145/1496976.1496978
M3 - Conference contribution
AN - SCOPUS:70349098534
SN - 9781605584676
T3 - ACM International Conference Proceeding Series
SP - 9
EP - 16
BT - Proceedings of the 9th ACM SIGCHI New Zealand Chapter's International Conference on Human-Computer Interaction
T2 - ACM SIGCHI New Zealand Chapter's International Conference on Human-Computer Interaction: Design Centered HCI, CHINZ 2008
Y2 - 2 July 2008
ER -