Learning object class detectors from weakly annotated video

A. Prest, C. Leistner, J. Civera, C. Schmid, V. Ferrari

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

Object detectors are typically trained on a large set of still images annotated with bounding boxes. This paper introduces an approach for learning object detectors from real-world web videos known only to contain objects of a target class. We propose a fully automatic pipeline that localizes objects in a set of videos of the class and learns a detector for it. The approach extracts candidate spatio-temporal tubes based on motion segmentation and then selects one tube per video jointly over all videos. To compare with the state of the art, we test our detector on still images, namely Pascal VOC 2007. We observe that frames extracted from web videos can differ significantly in quality from still images taken by a good camera. Thus, we formulate learning from videos as a domain adaptation task. We show that training from a combination of weakly annotated videos and fully annotated still images using domain adaptation yields a detector that outperforms one trained from still images alone.
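The joint selection step mentioned in the abstract (one tube per video, chosen over all videos together) can be illustrated with a small sketch. The abstract does not state the objective or the optimizer, so everything here is an assumption made for illustration: each candidate tube is represented by a hypothetical fixed-length appearance descriptor, the cost is the summed Euclidean distance between the selected tubes, and the optimizer is a simple coordinate descent (iterated conditional modes). It is not the authors' formulation.

```python
import numpy as np


def select_tubes(candidates, n_iters=10):
    """Pick one candidate tube per video so the picks look alike across videos.

    candidates: list with one array per video, each of shape (n_tubes, d),
    holding a hypothetical appearance descriptor per candidate tube.
    Returns a list with the index of the selected tube for each video.
    """
    n_videos = len(candidates)
    selected = [0] * n_videos  # start from the first candidate everywhere
    if n_videos < 2:
        return selected  # nothing to compare against

    for _ in range(n_iters):
        changed = False
        for v, feats in enumerate(candidates):
            # Descriptors of the tubes currently selected in the other videos.
            others = np.stack(
                [candidates[u][selected[u]] for u in range(n_videos) if u != v]
            )
            # Cost of each candidate in video v: summed distance to those tubes.
            diffs = feats[:, None, :] - others[None, :, :]
            costs = np.linalg.norm(diffs, axis=2).sum(axis=1)
            best = int(np.argmin(costs))
            if best != selected[v]:
                selected[v] = best
                changed = True
        if not changed:  # no video changed its pick: local optimum reached
            break
    return selected
```

Coordinate descent of this kind only reaches a local optimum and depends on the initial selection, so it is a stand-in for the idea of joint selection rather than a substitute for a proper global inference method.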
Original language: English
Title of host publication: Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 8
ISBN (Print): 978-1-4673-1226-4
Publication status: Published - 1 Jun 2012

Keywords / Materials (for Non-textual outputs)

  • Internet
  • image motion analysis
  • image segmentation
  • image sensors
  • learning (artificial intelligence)
  • object detection
  • video signal processing
  • Pascal VOC 2007
  • bounding-boxes
  • camera
  • candidate spatio-temporal tubes
  • domain adaptation
  • extracted frames
  • fully annotated still images
  • fully automatic pipeline
  • motion segmentation
  • object class detectors learning
  • real-world Web videos
  • still images
  • weakly annotated video
  • weakly annotated videos
  • Detectors
  • Electron tubes
  • Hidden Markov models
  • Image segmentation
  • Motion segmentation
  • Tracking
  • Training

