Edinburgh Research Explorer

Practical Data Synthesis for Large Samples

Research output: Contribution to journal › Article

Open Access permissions

Open

Documents

    Rights statement: © 2016-2017 by the authors

    Final published version, 387 KB, PDF document

http://repository.cmu.edu/jpc/vol7/iss3/4/
Original language: English
Article number: 4
Journal: Journal of Privacy and Confidentiality
Volume: 7
Issue number: 3
Publication status: Published - 31 May 2017

Abstract

We describe results on the creation and use of synthetic data that were derived in the context of a project to make synthetic extracts available for users of the UK Longitudinal Studies. A critical review of existing methods of inference from large synthetic data sets is presented. We introduce new variance estimates for use with large samples of completely synthesised data that do not require them to be generated from the posterior predictive distribution derived from the observed data and can be used with a single synthetic data set. We make recommendations on how to synthesise data based on these results. The practical consequences of these results are illustrated with an example from the Scottish Longitudinal Study.

ID: 37016097