We describe an implementation of datadriven selection of emphatic facial displays for an embodied conversational agent in a dialogue system. A corpus of sentences in the domain of the target dialogue system was recorded, and the facial displays used by the speaker were annotated. The data from those recordings was used in a range of models for generating facial displays, each model making use of a different amount of context or choosing displays differently within a context. The models were evaluated in two ways: by cross-validation against the corpus, and by asking users to rate the output. The predictions of the cross-validation study differed from the actual user ratings. While the cross-validation gave the highest scores to models making a majority choice within a context, the user study showed a significant preference for models that produced more variation. This preference was especially strong among the female subjects.
|Title of host publication||IN PROCEEDINGS OF EACL-2006|
|Publisher||ASSOC COMPUTATIONAL LINGUISTICS-ACL|
|Number of pages||8|
|Publication status||Published - 2006|