In this paper, we investigate the acoustic prosodic marking of demonstrative and personal pronouns in task-oriented dialog. Although it has been hypothesized that acoustic marking affects pronoun resolution, we find that the prosodic information extracted from the data is not sufficient to predict antecedent type reliably. Inter-speaker variation accounts for much of the prosodic variation that we find in our data. We conclude that prosodic cues should be handled with care in robust, speaker-independent dialog systems.
Publisher: Association for Computational Linguistics