Discovering object aspects from video

Anestis Papazoglou, Luca Del Pero, Vittorio Ferrari

Research output: Contribution to journal › Article › peer-review


We investigate the problem of automatically discovering the visual aspects of an object class. Existing methods discover aspects from still images under strong supervision, as they require time-consuming manual annotation of the objects' location (e.g. bounding boxes). Instead, we explore using video, which enables automatic localisation by motion segmentation. We introduce a new video dataset containing over 10,000 frames annotated with aspect labels for two classes: cars and tigers. We evaluate several strategies for aspect discovery using state-of-the-art descriptors (e.g. CNN), and assess the benefits of using automatic video segmentation. For this, we introduce a new protocol to evaluate aspect discovery directly, in contrast to the general trend of evaluating it indirectly (e.g. its impact on a recognition pipeline). Our results consistently show that leveraging the nature of video to discover visual aspects yields significantly higher accuracy. Finally, we discuss two new applications to showcase the potential of aspect discovery: image retrieval of aspects, and learning aspect transitions from video.
Original language: English
Pages (from-to): 206-217
Number of pages: 12
Journal: Image and Vision Computing
Early online date: 4 May 2016
Publication status: Published - Aug 2016

