Topic Models for Image Annotation and Text Illustration

Yansong Feng, Mirella Lapata

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Image annotation, the task of automatically generating description words for a picture, is a key component in various image search and retrieval applications. Creating image databases for model development is, however, costly and time-consuming, since the keywords must be hand-coded and the process repeated for new collections. In this work we exploit the vast resource of images and documents available on the web to develop image annotation models without any human involvement. We describe a probabilistic model based on the assumption that images and their co-occurring textual data are generated by mixtures of latent topics. We show that this model outperforms previously proposed approaches when applied to image annotation and the related task of text illustration, despite the noisy nature of our dataset.
Original language: English
Title of host publication: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL
Publisher: Association for Computational Linguistics
Pages: 831-839
Number of pages: 9
Publication status: Published - 2010
