A Conditional Deep Generative Model of People in Natural Images

Rodrigo de Bem, Arnab Ghosh, Adnane Boukhayma, T. Ajanthan, N. Siddharth, Philip Torr

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


We propose a deep generative model of humans in natural images which keeps 2D pose separated from other latent factors of variation, such as background scene and clothing. In contrast to methods that learn generative models of low-dimensional representations, e.g., segmentation masks and 2D skeletons, our single-stage end-to-end conditional-VAEGAN learns directly on the image space. The flexibility of this approach allows the sampling of people with independent variations of pose and appearance. Moreover, it enables the reconstruction of images conditioned on a given posture, allowing, for instance, pose transfer from one person to another. We validate our method on the Human3.6M dataset and achieve state-of-the-art results on the ChictopiaPlus benchmark. Our model, named Conditional-DGPose, outperforms the closest related work in the literature. It generates more realistic and accurate images in terms of both body posture and image quality, learning the underlying factors of pose and appearance variation.
Original language: English
Title of host publication: 2019 IEEE Winter Conference on Applications of Computer Vision (WACV)
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 10
ISBN (Electronic): 978-1-7281-1975-5
ISBN (Print): 978-1-7281-1976-2
Publication status: Published - 7 Mar 2019

Publication series

ISSN (Print): 1550-5790

