Combining Image-Level and Segment-Level Models for Automatic Annotation

Daniel Kuettel, Matthieu Guillaumin, Vittorio Ferrari

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

For the task of assigning labels to an image to summarize its contents, many early attempts use segment-level information and try to determine which parts of the image correspond to which labels. The best-performing methods use global image similarity and nearest-neighbor techniques to transfer labels from training images to test images. However, unlike segment-level methods, global methods cannot localize the labels in the images. Nor can they take advantage of training images that are only locally similar to a test image. We propose several ways to combine recent image-level and segment-level techniques to predict both image and segment labels jointly. We cast our experimental study in a unified framework for both image-level and segment-level annotation tasks. On three challenging datasets, our joint prediction of image and segment labels outperforms either prediction alone on both tasks. This confirms that the two levels offer complementary information.
Original language: English
Title of host publication: Advances in Multimedia Modeling
Subtitle of host publication: 18th International Conference, MMM 2012, Klagenfurt, Austria, January 4-6, 2012. Proceedings
Editors: Klaus Schoeffmann, Bernard Merialdo, Alexander G. Hauptmann, Chong-Wah Ngo, Yiannis Andreopoulos, Christian Breiteneder
Publisher: Springer Berlin Heidelberg
Pages: 16-28
Number of pages: 13
ISBN (Electronic): 978-3-642-27355-1
ISBN (Print): 978-3-642-27354-4
Publication status: Published - 2012

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer Berlin Heidelberg
Volume: 7131
ISSN (Print): 0302-9743

Keywords

  • image auto-annotation
  • image region labelling
  • keyword-based image retrieval
