Intra-genome variability in the dinucleotide composition of SARS-CoV-2

Research output: Contribution to journal › Article › peer-review


CpG dinucleotides are under-represented in the genomes of single-stranded RNA viruses, and SARS-CoV-2 is no exception. Artificial modification of CpG frequency is a valid approach for live attenuated vaccine development; if this is to be applied to SARS-CoV-2, we must first understand the role CpG motifs play in regulating SARS-CoV-2 replication. Accordingly, the CpG composition of the SARS-CoV-2 genome was characterised. CpG suppression amongst coronaviruses does not differ between virus genera, but does vary with host species and primary replication site (a proxy for tissue tropism), supporting the hypothesis that viral CpG content may influence cross-species transmission. Although SARS-CoV-2 exhibits strong CpG suppression overall, the degree of suppression varies considerably across the genome, and the Envelope (E) open reading frame (ORF) and ORF10 show no CpG suppression. Across the Coronaviridae, E genes display remarkably high variation in CpG composition, with those of SARS and SARS-CoV-2 having much higher CpG content than those of other coronaviruses isolated from humans. This is an ancestrally derived trait reflecting their bat origins. Conservation of CpG motifs in these regions suggests that they have a function that overrides the need to suppress CpG, an observation relevant to future strategies towards a rationally attenuated SARS-CoV-2 vaccine.
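CpG suppression of the kind described above is commonly quantified as an observed/expected (O/E) ratio: the CpG dinucleotide frequency divided by the product of the C and G mononucleotide frequencies. The Python sketch below illustrates that general calculation and a sliding-window variant for looking at variation along a sequence. It is not the authors' pipeline; the function names, the window parameters, and the choice of normalisation are assumptions made for illustration, and the article itself may use a different formulation.

```python
from collections import Counter


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one sequence.

    Assumed formula (a common convention, not taken from the article):
    O/E = (count(CG) * N) / (count(C) * count(G)), where N is sequence length.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    n = len(seq)
    counts = Counter(seq)
    c, g = counts["C"], counts["G"]
    if c == 0 or g == 0:
        return float("nan")  # ratio is undefined without both C and G
    # Count CG dinucleotides at every position (overlap is impossible for "CG").
    cpg = sum(1 for i in range(n - 1) if seq[i:i + 2] == "CG")
    return (cpg * n) / (c * g)


def windowed_cpg(seq: str, window: int, step: int) -> list[tuple[int, float]]:
    """O/E ratio in fixed-size windows; returns (start_index, ratio) pairs.

    `window` and `step` are hypothetical parameters chosen by the caller.
    """
    return [
        (start, cpg_obs_exp(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "ACGTGGCC"  # toy sequence, not real viral data
    print(cpg_obs_exp(toy))
    print(windowed_cpg(toy, window=4, step=4))
```

Under this convention, a ratio near 1 means CpG occurs about as often as the base composition predicts, and a ratio well below 1 indicates suppression. Applying the windowed function per ORF rather than per fixed window would be closer in spirit to the gene-level comparison the abstract describes.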
Original language: English
Journal: Virus Evolution
Early online date: 13 Aug 2020
Publication status: E-pub ahead of print - 13 Aug 2020

