Integrating Mechanisms of Visual Guidance in Naturalistic Language Production

Moreno Coco, Frank Keller

Research output: Contribution to journal › Article › peer-review


Situated language production requires the integration of visual attention and linguistic processing. Previous work has not conclusively disentangled the roles of perceptual scene information and structural sentence information in guiding visual attention. In this paper, we present an eye-tracking study demonstrating that three types of guidance (perceptual, conceptual, and structural) interact to control visual attention. In a cued language production experiment, we manipulate perceptual guidance (scene clutter) and conceptual guidance (cue animacy), and measure structural guidance (syntactic complexity of the utterance). Analysis of the time course of language production, before and during speech, reveals that all three forms of guidance affect the complexity of visual responses, quantified in terms of the entropy of attentional landscapes and the turbulence of scan patterns, especially during speech. We find that perceptual and conceptual guidance mediate the distribution of attention in the scene, whereas structural guidance closely relates to scan-pattern complexity. Furthermore, the eye-voice spans of the cued object and its perceptual competitor are similar, and their latency is mediated by both perceptual and structural guidance. These results rule out a strict interpretation of structural guidance as the single dominant form of visual guidance in situated language production. Rather, the phase of the task and the associated demands of cross-modal cognitive processing determine the mechanisms that guide attention.
Keywords: Eye-movements; language production; scene understanding; crossmodal processing; eye-voice span; structural guidance.
Original language: English
Pages (from-to): 131-150
Number of pages: 19
Journal: Cognitive Processing
Issue number: 2
Early online date: 23 Nov 2014
Publication status: Published - May 2015

