Visual complexity and its effects on referring expression generation

Micha Elsner, Alasdair Clarke, Hannah Rohde

Research output: Contribution to journal › Article › peer-review

Abstract

Speakers’ perception of a visual scene influences the language they use to describe it: which objects they choose to mention and how they characterize the relationships between them. We show that visual complexity can either delay or facilitate description generation, depending on how much disambiguating information is required and how useful the scene’s complexity can be in providing, for example, helpful landmarks. To do so, we measure speech onset times, eye gaze, and utterance content in a reference production experiment in which the target object is either unique or non-unique in a visual scene of varying size and complexity. Speakers delay speech onset if the target object is non-unique and requires disambiguation, and we argue that this reflects the cost of deciding on a high-level strategy for describing it. The eye-tracking data demonstrates that these delays increase when the speaker is able to conduct an extensive early visual search, implying that when a speaker scans too little of the scene early on, they may decide to begin speaking before becoming aware that their description is underspecified. Speakers’ content choices reflect the visual makeup of the scene: the number of distractors present and the availability of useful landmarks. Our results highlight the complex role of visual perception in reference production, showing that speakers can make good use of complexity in ways that reflect their visual processing of the scene.
Original language: English
Pages (from-to): 1-34
Journal: Cognitive Science
Early online date: 26 Jun 2017
Publication status: E-pub ahead of print - 26 Jun 2017

Keywords

  • referring expression generation
  • psycholinguistics
  • sentence processing
  • visual search
