Description
./README => short description of dataset
./stimuli/ => utterances (used in listening test) synthesised by 3 unit selection voices
./responses/ => listener responses for 3 preference tests
Abstract
This is the listening test data for the experiment presented in the ICASSP 2016 paper "Smooth Talking: Articulatory Join Costs for Unit Selection", which proposes and evaluates computation of unit selection join costs in the articulatory domain. Join cost calculation has so far dealt exclusively with acoustic speech parameters, and a large number of distance metrics have previously been tested in conjunction with a wide variety of acoustic parameterisations. In contrast, we propose here to calculate distance in articulatory space. The motivation for this is simple: physical constraints mean a human talker's mouth cannot "jump" from one configuration to a different one, so smooth evolution of articulator positions would also seem desirable for a good candidate unit sequence. To test this, we built Festival Multisyn voices using a large articulatory-acoustic dataset. We first synthesised 460 TIMIT sentences and confirmed our articulatory join cost gives appreciably different unit sequences compared to the standard Multisyn acoustic join cost. A listening test (3 sets of 25 sentence pairs, 30 listeners) then showed our articulatory cost is preferred at a rate of 58% compared to the standard Multisyn acoustic join cost.
Data Citation
Richmond, Korin; King, Simon. (2016). Listening test materials for "Smooth Talking: Articulatory Join Costs for Unit Selection", [dataset]. http://dx.doi.org/10.7488/ds/1315.
Date made available | 19 Jan 2016 |
---|---|
Publisher | Edinburgh DataShare |