CpG dinucleotides are under-represented in the genomes of single-stranded RNA viruses, and SARS-CoV-2 is no exception. Artificial modification of CpG frequency is a valid approach for live attenuated vaccine development; if this is to be applied to SARS-CoV-2, we must first understand the role CpG motifs play in regulating SARS-CoV-2 replication. Accordingly, the CpG composition of the SARS-CoV-2 genome was characterised. CpG suppression amongst coronaviruses does not differ between virus genera, but does vary with host species and primary replication site (a proxy for tissue tropism), supporting the hypothesis that viral CpG content may influence cross-species transmission. Although SARS-CoV-2 exhibits strong CpG suppression overall, the degree of suppression varies considerably across the genome, and the Envelope (E) open reading frame (ORF) and ORF10 show no CpG suppression. Across the Coronaviridae, E genes display remarkably high variation in CpG composition, with those of SARS and SARS-CoV-2 having much higher CpG content than those of other coronaviruses isolated from humans. This is an ancestrally derived trait reflecting their bat origins. Conservation of CpG motifs in these regions suggests that they have a function that overrides the need to suppress CpG, an observation relevant to future strategies towards a rationally attenuated SARS-CoV-2 vaccine.
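The abstract does not say how CpG suppression was measured. One widely used measure is the observed-to-expected (O/E) CpG ratio, where values well below 1 indicate suppression. The sketch below is a minimal illustration under that assumption, not the authors' method. The function names `cpg_obs_exp` and `windowed_cpg`, and the window and step sizes, are hypothetical choices made for this example.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (count(CG) * N) / (count(C) * count(G)).

    Assumption: this common O/E definition stands in for whatever
    measure of CpG suppression the study actually used.
    """
    s = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    n = len(s)
    c = s.count("C")
    g = s.count("G")
    if n < 2 or c == 0 or g == 0:
        # No CpG is possible, so report 0.0 rather than dividing by zero.
        return 0.0
    # "CG" cannot overlap with itself, so str.count finds every occurrence.
    cpg = s.count("CG")
    return (cpg * n) / (c * g)


def windowed_cpg(seq: str, window: int = 500, step: int = 100):
    """Return (start, O/E ratio) pairs for sliding windows along a sequence.

    The window and step defaults are illustrative, not taken from the source.
    """
    return [
        (start, cpg_obs_exp(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]
```

Applied to a genome sequence, `windowed_cpg` would give a per-region profile of the kind needed to see that suppression varies across the genome, for example to compare individual ORFs against the genome-wide value.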