The Limits of Individual Identification from Sample Allele Frequencies: Theory and Statistical Analysis

Peter M. Visscher, William G. Hill

Research output: Contribution to journal › Article › peer-review

Abstract

It was shown recently using experimental data that it is possible under certain conditions to determine whether a person with known genotypes at a number of markers was part of a sample from which only allele frequencies are known. Using population genetic and statistical theory, we show that the power of such identification is, approximately, proportional to the number of independent SNPs divided by the size of the sample from which the allele frequencies are available. We quantify the limits of identification and propose likelihood and regression methods for analysing such data. We show that these two methods have similar statistical properties and that both have more desirable properties, in terms of type-I error rate and statistical power, than test statistics suggested in the literature.
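
The scaling stated in the abstract can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not taken from the article: population allele frequencies are known exactly, SNPs are independent, genotypes follow Hardy-Weinberg proportions, and a no-intercept regression stands in for the regression analysis (it is not claimed to be the authors' exact estimator). All function names are hypothetical. Under these assumptions, regressing a person's genotype deviation on the sample's frequency deviation should give a slope near 1 for a sample member and near 0 for an outsider, with a null standard error of roughly sqrt(n/m) for n sampled individuals and m SNPs, so the squared signal-to-noise ratio grows like m/n.

```python
import random


def allele_fraction(p, rng):
    """Genotype at one SNP as an allele fraction (0, 0.5 or 1), assuming Hardy-Weinberg."""
    return ((rng.random() < p) + (rng.random() < p)) / 2


def slope_through_origin(x, y):
    """Least-squares slope of y on x with no intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def simulate(m=10000, n=50, seed=1):
    """Return (member slope, outsider slope, approximate null SE) for m SNPs, n individuals."""
    rng = random.Random(seed)
    # Assumed-known population allele frequencies (illustrative range).
    pop_freq = [rng.uniform(0.1, 0.9) for _ in range(m)]
    # Genotypes of the n individuals whose pooled frequencies are published.
    sample = [[allele_fraction(p, rng) for p in pop_freq] for _ in range(n)]
    # One person from the same population who is not in the sample.
    outsider = [allele_fraction(p, rng) for p in pop_freq]

    sample_freq = [sum(ind[j] for ind in sample) / n for j in range(m)]
    x = [f - p for f, p in zip(sample_freq, pop_freq)]

    def slope(person):
        y = [g - p for g, p in zip(person, pop_freq)]
        return slope_through_origin(x, y)

    # sqrt(n/m) is the null SE when all SNPs have equal heterozygosity;
    # with varying frequencies it is only a rough guide.
    return slope(sample[0]), slope(outsider), (n / m) ** 0.5


member_slope, outsider_slope, approx_se = simulate()
print(f"member slope:   {member_slope:.3f}")
print(f"outsider slope: {outsider_slope:.3f}")
print(f"approx null SE: {approx_se:.3f}")
```

Raising n while holding m fixed widens the null standard error and makes the member harder to distinguish from the outsider, which is the sense in which identification power depends on the ratio of independent SNPs to sample size.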

Original language: English
Article number: e1000628
Number of pages: 6
Journal: PLoS Genetics
Volume: 5
Issue number: 10
Publication status: Published - Oct 2009
