TY - JOUR

T1 - A Comparison of Models to Infer the Distribution of Fitness Effects of New Mutations

AU - Kousathanas, Athanasios

AU - Keightley, Peter D

PY - 2013/4/1

Y1 - 2013/4/1

AB - Knowing the distribution of fitness effects (DFE) of new mutations is important for several topics in evolutionary genetics. Existing computational methods to infer the DFE from DNA polymorphism data have frequently assumed that the DFE can be approximated by a unimodal distribution, such as a lognormal or a gamma distribution. However, if the true DFE departs substantially from the assumed distribution (e.g., if the DFE is multimodal), this could lead to misleading inferences about its properties. We conducted simulations to test the performance of parametric and non-parametric discretised distribution models to infer the properties of the DFE for cases in which the true DFE is unimodal, bimodal or multimodal. We found that lognormal and gamma distribution models can perform poorly in recovering the properties of the distribution if the true DFE is bimodal or multimodal, whereas the discretised distribution models provide a better fit. If there is a sufficient amount of data, the discretised models can be used to detect multimodality of the DFE and to accurately infer the mean effect and the average fixation probability of a new deleterious mutation. We fitted several models for the DFE of amino acid-changing mutations using whole-genome data from Drosophila melanogaster and the house mouse subspecies Mus musculus castaneus. A lognormal DFE best explains the data for D. melanogaster, whereas we find evidence for a bimodal DFE in M. m. castaneus.

U2 - 10.1534/genetics.112.148023

DO - 10.1534/genetics.112.148023

M3 - Article

C2 - 23341416

VL - 193

SP - 1197

EP - 1208

JO - Genetics

JF - Genetics

SN - 0016-6731

IS - 4

ER -