TY - JOUR
T1 - Inference of identity by descent in population isolates and optimal sequencing studies
AU - Glodzik, Dominik
AU - Navarro, Pau
AU - Vitart, Veronique
AU - Hayward, Caroline
AU - McQuillan, Ruth
AU - Wild, Sarah H
AU - Dunlop, Malcolm G
AU - Rudan, Igor
AU - Campbell, Harry
AU - Haley, Chris
AU - Wright, Alan F
AU - Wilson, James F
AU - McKeigue, Paul
PY - 2013/10
Y1 - 2013/10
N2 - In an isolated population, individuals are likely to share large genetic regions inherited from common ancestors. Identity by descent (IBD) can be inferred from SNP genotypes, which is useful in a number of applications, including identifying genetic variants influencing complex disease risk, and planning efficient cohort-sequencing strategies. We present ANCHAP, a method for detecting IBD in isolated populations. We compare the accuracy of the method against other long-range and local phasing methods, using parent-offspring trios. In our experiments, we show that ANCHAP performs similarly to the other long-range method, but requires an order of magnitude fewer computational resources. A local phasing model is able to achieve similar sensitivity, but only at the cost of higher false discovery rates. In some regions of the genome, the studied individuals share haplotypes particularly often, which hints at the history of the populations studied. We demonstrate the method using SNP genotypes from three isolated island populations, as well as in a cohort of unrelated individuals. In samples from three isolated populations of around 1000 individuals each, an average individual shares a haplotype at a genetic locus with 9-12 other individuals, compared with only 1 individual within the non-isolated population. We describe an application of ANCHAP to optimally choose samples in resequencing studies. We find that with sample sizes of 1000 individuals from an isolated population genotyped using a dense SNP array, and with 20% of these individuals sequenced, 65% of sequences of the unsequenced subjects can be partially inferred.
AB - In an isolated population, individuals are likely to share large genetic regions inherited from common ancestors. Identity by descent (IBD) can be inferred from SNP genotypes, which is useful in a number of applications, including identifying genetic variants influencing complex disease risk, and planning efficient cohort-sequencing strategies. We present ANCHAP, a method for detecting IBD in isolated populations. We compare the accuracy of the method against other long-range and local phasing methods, using parent-offspring trios. In our experiments, we show that ANCHAP performs similarly to the other long-range method, but requires an order of magnitude fewer computational resources. A local phasing model is able to achieve similar sensitivity, but only at the cost of higher false discovery rates. In some regions of the genome, the studied individuals share haplotypes particularly often, which hints at the history of the populations studied. We demonstrate the method using SNP genotypes from three isolated island populations, as well as in a cohort of unrelated individuals. In samples from three isolated populations of around 1000 individuals each, an average individual shares a haplotype at a genetic locus with 9-12 other individuals, compared with only 1 individual within the non-isolated population. We describe an application of ANCHAP to optimally choose samples in resequencing studies. We find that with sample sizes of 1000 individuals from an isolated population genotyped using a dense SNP array, and with 20% of these individuals sequenced, 65% of sequences of the unsequenced subjects can be partially inferred.
U2 - 10.1038/ejhg.2012.307
DO - 10.1038/ejhg.2012.307
M3 - Article
C2 - 23361219
VL - 21
SP - 1140
EP - 1145
JO - European Journal of Human Genetics
JF - European Journal of Human Genetics
IS - 10
ER -