Projects per year
Abstract
Analysis of genomic data requires an efficient way to calculate likelihoods across very large numbers of loci. We describe a general method for finding the distribution of genealogies: we allow migration between demes, splitting of demes [as in the isolationwithmigration (IM) model], and recombination between linked loci. These processes are described by a set of linear recursions for the generating function of branch lengths. Under the infinitesites model, the probability of any configuration of mutations can be found by differentiating this generating function. Such calculations are feasible for small numbers of sampled genomes: as an example, we show how the generating function can be derived explicitly for three genes under the twodeme IM model. This derivation is done automatically, using Mathematica. Given data from a large number of unlinked and nonrecombining blocks of sequence, these results can be used to find maximumlikelihood estimates of model parameters by tabulating the probabilities of all relevant mutational configurations and then multiplying across loci. The feasibility of the method is demonstrated by applying it to simulated data and to a data set previously analyzed by Wang and Hey (2010) consisting of 26,141 loci sampled from Drosophila simulans and D. melanogaster. Our results suggest that such likelihood calculations are scalable to genomic data as long as the numbers of sampled individuals and mutations per sequence block are small.
Original language  English 

Pages (fromto)  977987 
Number of pages  42 
Journal  Genetics 
Volume  189 
Issue number  3 
DOIs  
Publication status  Published  Nov 2011 
Projects
 1 Finished

DECIPHERING THE GENOMIC SIGNATURES OF SPECIATION AND INTROGRESSION
1/09/11 → 31/12/14
Project: Research