Projects per year
The top-down guidance of visual attention is one of the main factors allowing humans to effectively process vast amounts of incoming visual information. Nevertheless we still lack a full understanding of the visual, semantic, and memory processes governing visual attention. In this paper, we present a computational model of visual search capable of predicting the most likely positions of target objects. The model does not require a separate training phase, but learns likely target positions in an incremental fashion based on a memory of previous fixations. We evaluate the model on two search tasks and show that it outperforms saliency alone and comes close to the maximal performance of the Contextual Guidance Model (CGM; Torralba, Oliva, Castelhano, & Henderson, 2006; Ehinger, Hidalgo-Sotelo, Torralba, & Oliva, 2009), even though our model does not perform scene recognition or compute global image statistics. The search performance of our model can be further improved by combining it with the CGM.