Dataset Condensation with Differentiable Siamese Augmentation

Bo Zhao, Hakan Bilen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In many machine learning problems, large-scale datasets have become the de-facto standard to train state-of-the-art deep networks, at the price of a heavy computation load. In this paper, we focus on condensing large training sets into significantly smaller synthetic sets which can be used to train deep neural networks from scratch with a minimum drop in performance. Inspired by recent training set synthesis methods, we propose Differentiable Siamese Augmentation, which enables effective use of data augmentation to synthesize more informative synthetic images and thus achieves better performance when training networks with augmentations. Experiments on multiple image classification benchmarks demonstrate that the proposed method obtains substantial gains over the state of the art, including 7% improvements on the CIFAR10 and CIFAR100 datasets. We show that, with less than 1% of the data, our method achieves 99.6%, 94.9%, 88.5%, and 71.5% relative performance on MNIST, FashionMNIST, SVHN, and CIFAR10, respectively. We also explore the use of our method in continual learning and neural architecture search, and show promising results.
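The core idea described above can be illustrated with a minimal PyTorch sketch: sample one set of augmentation parameters and apply it identically ("siamese") to both the real and the synthetic batch, using only differentiable operations so that gradients can flow back into the synthetic pixels. The function name, the choice of flip and brightness augmentations, and the parameter ranges below are illustrative assumptions, not the authors' implementation.

```python
import torch

def siamese_augment(real, syn, seed=None):
    """Apply the SAME randomly sampled augmentation to both batches.

    `real` and `syn` are image tensors of shape (N, C, H, W). All ops
    used here are differentiable w.r.t. the pixel values, so gradients
    propagate back to `syn` when it is a learnable tensor.
    """
    if seed is not None:
        torch.manual_seed(seed)  # makes the shared sampling reproducible
    # Shared random horizontal flip: one decision for both batches.
    if torch.rand(1).item() < 0.5:
        real = torch.flip(real, dims=[3])
        syn = torch.flip(syn, dims=[3])
    # Shared random brightness shift: one factor for both batches.
    shift = (torch.rand(1) - 0.5) * 0.4
    real = real + shift
    syn = syn + shift
    return real, syn

# Usage sketch: synthetic images are learnable, gradients flow through
# the augmentation back to their pixels.
real = torch.randn(4, 3, 32, 32)
syn = torch.randn(4, 3, 32, 32, requires_grad=True)
real_aug, syn_aug = siamese_augment(real, syn, seed=0)
syn_aug.sum().backward()  # syn.grad is now populated
```

The key design point is that both batches see the *same* transformation parameters each step, so the gradient-matching signal compares like with like rather than averaging over mismatched augmentations.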
Original language: English
Title of host publication: Proceedings of the 38th International Conference on Machine Learning
Publisher: PMLR
Pages: 12674-12685
Publication status: Published - 18 Jul 2021
Event: Thirty-eighth International Conference on Machine Learning - Online
Duration: 18 Jul 2021 – 24 Jul 2021
https://icml.cc/

Publication series

Name: Proceedings of Machine Learning Research
Publisher: PMLR
Volume: 139
ISSN (Electronic): 2640-3498

Conference

Conference: Thirty-eighth International Conference on Machine Learning
Abbreviated title: ICML 2021
Period: 18/07/21 – 24/07/21
Internet address: https://icml.cc/

