A Simple Multiallele Model and Its Application to Identifying Preferred-Unpreferred Codons Using Polymorphism Data

Kai Zeng

Research output: Contribution to journal › Article › peer-review

Abstract

Analysis of within-species polymorphism data usually relies on population genetic models that assume two alleles at a locus (e.g., the infinite sites model). However, many problems of interest can be tackled more naturally by multiallele models. In this study, I construct a model that can accommodate an arbitrary number of alleles at a locus, mutational biases, and selective differences between each of the alleles. It is constructed by representing population dynamics by a Markov transition matrix and is based on the assumption that at most two variants exist at each polymorphic site. A likelihood-based method for inferring the selection and mutational parameters of the model is constructed and is shown to have high accuracy. I use this method to jointly infer preferred codons and mutational parameters in Drosophila melanogaster. Twenty-one codons are identified as preferred, 19 of which were found previously by methods that do not use polymorphism data. Interestingly, the selective difference between the fittest and the worst codons encoding the same amino acid is positively correlated with the number of synonymous codons for that amino acid, in agreement with previous analyses of interspecies data using phylogenetic models. The inferred mutation matrix is highly asymmetric, with C→T and G→A being the most common and constituting approximately 18% and 19% of all mutation events, respectively. These results suggest that the new model provides a useful framework for analyzing polymorphism data sampled from multiallele systems.

Original language: English
Pages (from-to): 1327-1337
Number of pages: 11
Journal: Molecular Biology and Evolution
Volume: 27
Issue number: 6
Publication status: Published - Jun 2010
