Unsupervised learning of object frames by dense equivariant image labelling

James Thewlis, Hakan Bilen, Andrea Vedaldi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract / Description of output

One of the key challenges of visual perception is to extract abstract models of 3D objects and object categories from visual measurements, which are affected by complex nuisance factors such as viewpoint, occlusion, motion, and deformations. Starting from the recent idea of viewpoint factorization, we propose a new approach that, given a large number of images of an object and no other supervision, can extract a dense object-centric coordinate frame. This coordinate frame is invariant to deformations of the images and comes with a dense equivariant labelling neural network that can map image pixels to their corresponding object coordinates. We demonstrate the applicability of this method to simple articulated objects and deformable objects such as human faces, learning embeddings from random synthetic transformations or optical flow correspondences, all without any manual supervision.
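
The equivariance property mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1D "image", the cyclic shift standing in for a random synthetic transformation, and the names `warp`, `equivariance_loss`, `content_labeller`, and `position_labeller` are all assumptions made for illustration. The idea shown is that a labelling is equivariant when the label assigned to a pixel follows that pixel through the warp.

```python
import random


def warp(image, g):
    """Move each pixel u of `image` to position g[u] (g is a permutation)."""
    out = [0.0] * len(image)
    for u, v in enumerate(g):
        out[v] = image[u]
    return out


def equivariance_loss(labeller, image, g):
    """Mean squared gap between the label of pixel u in the original image
    and the label of the corresponding pixel g[u] in the warped image."""
    labels = labeller(image)
    warped_labels = labeller(warp(image, g))
    n = len(image)
    return sum((labels[u] - warped_labels[g[u]]) ** 2 for u in range(n)) / n


def content_labeller(image):
    # Hypothetical labeller: the label depends only on local appearance,
    # so it travels with the pixel and is equivariant by construction.
    return [2.0 * p - 1.0 for p in image]


def position_labeller(image):
    # Hypothetical labeller: the label is the pixel position and ignores
    # image content, so it does not follow the warp.
    n = len(image)
    return [u / n for u in range(n)]


rng = random.Random(0)
n = 16
image = [rng.random() for _ in range(n)]

# Assumed stand-in for a synthetic transformation: a cyclic shift by 3 pixels.
g = [(u + 3) % n for u in range(n)]

loss_content = equivariance_loss(content_labeller, image, g)
loss_position = equivariance_loss(position_labeller, image, g)
print("content labeller loss:", loss_content)
print("position labeller loss:", loss_position)
```

In this toy setting the content-based labeller incurs zero loss while the position-based one does not. A learned labelling network would be trained to reduce a loss of this kind over many images and transformations, although the exact objective used in the paper is not given on this page.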
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 30 (NIPS 2017)
Place of Publication: California, United States
Publisher: Neural Information Processing Systems Foundation, Inc
Number of pages: 12
Publication status: Published - 9 Dec 2017
Event: NIPS 2017: 31st Conference on Neural Information Processing Systems - Long Beach, California, United States
Duration: 4 Dec 2017 to 9 Dec 2017

Publication series

Name: Advances in Neural Information Processing Systems
ISSN (Electronic): 1049-5258


Conference: NIPS 2017
Abbreviated title: NIPS 2017
Country/Territory: United States

