Evaluating speech synthesis intelligibility using Amazon Mechanical Turk

Maria K. Wolters, Karl B. Isaac, Steve Renals

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract / Description of output

Microtask platforms such as Amazon Mechanical Turk (AMT) are increasingly used to create speech and language resources. AMT in particular allows researchers to quickly recruit a large number of fairly demographically diverse participants. In this study, we investigated whether AMT can be used for comparing the intelligibility of speech synthesis systems. We conducted two experiments in the lab and via AMT, one comparing US English diphone to US English speaker-adaptive HTS synthesis and one comparing UK English unit selection to UK English speaker-dependent HTS synthesis. While AMT word error rates were worse than lab error rates, AMT results were more sensitive to relative differences between systems. This is mainly due to the larger number of listeners. Boxplots and multilevel modelling allowed us to identify listeners who performed particularly badly, while thresholding was sufficient to eliminate rogue workers. We conclude that AMT is a viable platform for synthetic speech intelligibility comparisons.
Original language: English
Title of host publication: Proc. 7th Speech Synthesis Workshop (SSW7)
Number of pages: 6
Publication status: Published - 2010

