Ubiquitous selective constraints in the Drosophila genome revealed by a genome-wide interspecies comparison

Daniel Halligan, Peter Keightley

Research output: Contribution to journal › Article › peer-review

Abstract

Non-coding DNA comprises ∼80% of the euchromatic portion of the Drosophila melanogaster genome. Non-coding sequences are known to contain functionally important elements controlling gene expression, but the proportion of sites that are selectively constrained is still largely unknown. We have compared the complete D. melanogaster and Drosophila simulans genome sequences to estimate mean selective constraint (the fraction of mutations that are eliminated by selection) in coding and non-coding DNA by standardizing to substitution rates in putatively unconstrained sequences. We show that constraint is positively correlated with intronic and intergenic sequence length and is generally remarkably strong in non-coding DNA, implying that more than half of all point mutations in the Drosophila genome are deleterious. This fraction is also likely to be an underestimate if many substitutions in non-coding DNA are adaptively driven to fixation. We also show that substitutions in long introns and intergenic sequences are clustered, such that there is an excess of substitutions <8 bp apart and a deficit farther apart. These results suggest that there are blocks of constrained nucleotides, presumably involved in gene expression control, that are concentrated in long non-coding sequences. Furthermore, we infer that there is more than three times as much functional non-coding DNA as protein-coding DNA in the Drosophila genome. Most deleterious mutations therefore occur in non-coding DNA, and these may make an important contribution to a wide variety of evolutionary processes.
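
The constraint measure defined in the abstract (the fraction of mutations eliminated by selection, standardized to the substitution rate of putatively unconstrained sequence) can be illustrated with a minimal sketch. It assumes the commonly used form C = 1 - observed/expected, where the expected count is the number of sites multiplied by the neutral substitution rate. The function name, parameter names, and all numbers are hypothetical and are not taken from the paper, whose actual estimation procedure may differ.

```python
def selective_constraint(observed_subs: float, sites: int, neutral_rate: float) -> float:
    """Estimate constraint as 1 - observed / expected substitutions.

    Assumption: expected = sites * neutral_rate, where neutral_rate is the
    per-site substitution rate in putatively unconstrained sequence.
    """
    if sites <= 0 or neutral_rate <= 0:
        raise ValueError("sites and neutral_rate must be positive")
    expected_subs = sites * neutral_rate
    return 1.0 - observed_subs / expected_subs


# Hypothetical illustrative values, not data from the study:
# 800 sites, a neutral rate of 0.125 substitutions per site,
# and 50 substitutions actually observed.
constraint = selective_constraint(observed_subs=50, sites=800, neutral_rate=0.125)
print(f"Estimated constraint: {constraint:.2f}")
```

Under this form, a value of 0 means the sequence evolves at the neutral rate, values near 1 mean almost all mutations are removed by selection, and negative values can arise when a sequence evolves faster than the neutral reference.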
Original language: English
Pages (from-to): 875-88
Journal: Genome Research
Volume: 16
Issue number: 7
Publication status: Published - 2006
