Learning to Predict Keypoints and Structure of Articulated Objects without Supervision

Titas Anciukevicius, Paul Henderson, Hakan Bilen

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Reasoning about the structure and motion of novel object classes is a core ability in human cognition, crucial for manipulating objects and predicting their possible motion. We present a method that learns to infer the skeleton structure of a novel articulated object from a single image, in terms of joints and rigid links connecting them. The model learns without supervision from a dataset of objects having diverse structures, in different poses and states of articulation. To achieve this, it is trained to explain the differences between pairs of images in terms of a latent skeleton that defines how to transform one into the other. Experiments on several datasets show that our model predicts joint locations significantly more accurately than prior works on unsupervised keypoint discovery; moreover, unlike existing methods, it can predict varying numbers of joints depending on the observed object. It also successfully predicts the connections between joints, even for structures not seen during training.
Original language: English
Title of host publication: 2022 26th International Conference on Pattern Recognition, ICPR 2022
Publisher: Institute of Electrical and Electronics Engineers
Pages: 3383-3390
Number of pages: 8
ISBN (Electronic): 9781665490627
ISBN (Print): 9781665490634
Publication status: Published - 29 Nov 2022
Event: 26th International Conference on Pattern Recognition, ICPR 2022 - Montreal, Canada
Duration: 21 Aug 2022 to 25 Aug 2022

Publication series

Name: Proceedings - International Conference on Pattern Recognition
Volume: 2022-August
ISSN (Print): 1051-4651
ISSN (Electronic): 2831-7475

Conference

Conference: 26th International Conference on Pattern Recognition, ICPR 2022
Country/Territory: Canada
City: Montreal
Period: 21/08/22 to 25/08/22
