Despite the fundamental importance of single nucleotide polymorphisms (SNPs) to human evolution, there are still large gaps in our understanding of the forces that shape their distribution across the genome. SNPs have been shown not to be evenly distributed, with directly adjacent SNPs found unusually frequently. Why this is the case is unclear. We illustrate how neighbouring SNPs that cannot be explained by a single mutation event (termed here sequential dinucleotide mutations, SDMs) are driven by processes distinct from those underlying SNPs and multinucleotide polymorphisms (MNPs). By studying variation across populations, including a novel cohort of 1,358 Scottish genomes, we show that SDMs are over twice as common as MNPs and, like SNPs, display distinct mutational spectra across populations. These biases are not only different to those observed among SNPs and MNPs, but also more divergent between human population groups. We show that the changes that make up SDMs are not independent, and identify a distinct mutational profile, CA → CG → TG, that is observed an order of magnitude more often than expected from background SNP rates and the numbers of other SDMs involving the gain and deamination of CpG sites. Intriguingly, particular pathways through the amino acid code appear to have been favoured relative to expectations from intergenic SDM rates and the occurrence of coding SNPs, in particular those that lead to the creation of single codon amino acids. Finally, we present evidence that epistatic selection has potentially disfavoured sequential non-synonymous changes in the human genome.
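As a purely illustrative sketch (not the authors' analysis pipeline), the two-step structure of an SDM can be made concrete: a dinucleotide that differs at both positions can be reached through two possible single-base intermediates, depending on which position changes first. The function names `two_step_paths` and `passes_through_cpg` are hypothetical and chosen only for this example.

```python
def two_step_paths(start: str, end: str) -> list[tuple[str, str, str]]:
    """List both single-base routes between two dinucleotides.

    Assumes start and end differ at both positions, so each route
    has exactly one intermediate dinucleotide.
    """
    if len(start) != 2 or len(end) != 2:
        raise ValueError("expected dinucleotides of length 2")
    if start[0] == end[0] or start[1] == end[1]:
        raise ValueError("both positions must differ")
    via_first = end[0] + start[1]    # first base changes first
    via_second = start[0] + end[1]   # second base changes first
    return [(start, via_first, end), (start, via_second, end)]


def passes_through_cpg(path: tuple[str, str, str]) -> bool:
    """True if the intermediate step of the path is a CpG (CG) dinucleotide."""
    return path[1] == "CG"


# The CA -> TG change can go through TA or through the CpG intermediate CG.
for path in two_step_paths("CA", "TG"):
    label = " (CpG intermediate)" if passes_through_cpg(path) else ""
    print(" -> ".join(path) + label)
```

Of the two routes from CA to TG, only CA → CG → TG first gains a CpG site (A → G at the second position) and then loses it through a C → T change at the first position, which is the pathway singled out in the abstract.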
- sequential dinucleotide mutations
- multi-nucleotide polymorphisms
- multi-nucleotide mutations
- human mutation
- DNA repair
- epistatic selection