On Memorization in Probabilistic Deep Generative Models

Gerrit J. J. van den Burg, Christopher K. I. Williams

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


Recent advances in deep generative models have led to impressive results in a variety of application domains. Motivated by the possibility that deep learning models might memorize part of the input data, there have been increased efforts to understand how memorization arises. In this work, we extend a recently proposed measure of memorization for supervised learning (Feldman, 2019) to the unsupervised density estimation problem and adapt it to be more computationally efficient. Next, we present a study that demonstrates how memorization can occur in probabilistic deep generative models such as variational autoencoders. This reveals that the form of memorization to which these models are susceptible differs fundamentally from mode collapse and overfitting. Furthermore, we show that the proposed memorization score measures a phenomenon that is not captured by commonly used nearest-neighbor tests. Finally, we discuss several strategies that can be used to limit memorization in practice. Our work thus provides a framework for understanding problematic memorization in probabilistic generative models.
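To make the flavor of such a score concrete, here is a minimal, hypothetical sketch in the spirit of Feldman's (2019) leave-one-out definition: the memorization score of a training example is its log-density under a model fit on all the data, minus its log-density under a model fit without it. A simple maximum-likelihood Gaussian stands in for a deep generative model; the function names and the Gaussian stand-in are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def gaussian_logpdf(x, mean, var):
    # Log-density of a univariate Gaussian with the given mean and variance.
    return -0.5 * np.log(2 * np.pi * var) - 0.5 * (x - mean) ** 2 / var

def memorization_score(data, i):
    # Hypothetical leave-one-out memorization score in the spirit of
    # Feldman (2019): log-density of example i under a model fit on all
    # the data, minus its log-density under a model fit on the data with
    # example i removed. An MLE Gaussian stands in for a deep model.
    loo = np.delete(data, i)
    logp_with = gaussian_logpdf(data[i], data.mean(), data.var())
    logp_without = gaussian_logpdf(data[i], loo.mean(), loo.var())
    return logp_with - logp_without

rng = np.random.default_rng(0)
data = np.append(rng.normal(size=200), 8.0)  # 200 inliers plus one outlier

outlier_score = memorization_score(data, 200)  # the outlier at 8.0
inlier_score = memorization_score(data, 0)     # a typical inlier
```

An atypical example (the outlier) receives a much larger score than a typical one, since the fitted density near it collapses once it is removed from the training set; this is the sense in which the score picks out memorized examples rather than overfitting of the model as a whole.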
Original language: English
Title of host publication: Advances in Neural Information Processing Systems 34 proceedings (NeurIPS 2021)
Publisher: Neural Information Processing Systems
Number of pages: 18
Publication status: Published - 6 Dec 2021
Event: 35th Conference on Neural Information Processing Systems - Virtual
Duration: 6 Dec 2021 - 14 Dec 2021

Publication series

Name: Advances in Neural Information Processing Systems
ISSN (Print): 1049-5258


Conference: 35th Conference on Neural Information Processing Systems
Abbreviated title: NeurIPS 2021


