How do humans and machine learning models track multiple objects through occlusion?

Benjamin Peters, Eivinas Butkus, Nikolaus Kriegeskorte

Research output: Contribution to conference › Paper › peer-review

Abstract / Description of output

Interacting with a complex environment often requires us to track multiple task-relevant objects not all of which are continually visible. The cognitive literature has focused on tracking a subset of visible identical abstract objects (e.g., circles), isolating the tracking component from its context in real-world experience. In the real world, object tracking is harder in that objects may not be continually visible and easier in that objects differ in appearance and so their recognition can rely on both remembered position and current appearance. Here we introduce a generalized task that combines tracking and recognition of valued objects that move in complex trajectories and frequently disappear behind occluders. Humans and models (from the computer-vision literature on object tracking) performed tasks varying widely in terms of the number of objects to be tracked, the number of distractors, the presence of an occluder, and the appearance similarity between targets and distractors. We replicated results from the human literature, including a deterioration of tracking performance with the number and similarity of targets and distractors. In addition, we find that increasing levels of occlusion reduce performance. All models tested here behaved in qualitatively different ways from human observers, showing superhuman performance for large numbers of targets, and subhuman performance under conditions of occlusion. Our framework will enable future studies to connect the human behavioral and engineering literatures, so as to test image-computable multiple-object-tracking models as models of human performance and to investigate how tracking and recognition interact under natural conditions of dynamic motion and occlusion.
Original language: English
Pages: 1-13
Number of pages: 13
Publication status: Published - 18 Oct 2022
Event: 4th Workshop on Shared Visual Representations in Human and Machine Visual Intelligence - New Orleans, United States
Duration: 2 Dec 2022

Workshop

Workshop: 4th Workshop on Shared Visual Representations in Human and Machine Visual Intelligence
Abbreviated title: SVRHM 2022
Country/Territory: United States
City: New Orleans
Period: 2/12/22
