Abstract / Description of output
'Dated-tip' methods of molecular dating use DNA sequences sampled at different times to estimate the age of their most recent common ancestor. Several tests of 'temporal signal' are available to determine whether data sets are suitable for such analysis. However, it remains unclear whether these tests are reliable. We investigate the performance of several tests of temporal signal, including some recently suggested modifications. We use simulated data (where the true evolutionary history is known), and whole genomes of methicillin-resistant Staphylococcus aureus (to show how particular problems arise with real-world data sets). We show that all of the standard tests of temporal signal are seriously misleading for data where temporal and genetic structures are confounded (i.e. where closely related sequences are more likely to have been sampled at similar times). This is not an artefact of genetic structure or tree shape per se, and can arise even when sequences have measurably evolved during the sampling period. More positively, we show that a 'clustered permutation' approach introduced by Duchêne et al. (Molecular Biology and Evolution, 32, 2015, 1895) can successfully correct for this artefact in all cases, and we introduce techniques for implementing this method with real data sets. The confounding of temporal and genetic structures may be difficult to avoid in practice, particularly for outbreaks of infectious disease, or when using ancient DNA. Therefore, we recommend the use of 'clustered permutation' for all analyses. The failure of the standard tests may explain why different methods of dating pathogen origins have reached such wildly different conclusions.
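As a rough illustration of the difference between a standard and a 'clustered' permutation test, the following minimal sketch shuffles sampling dates either among individual tips or among whole clusters of tips. It is not the authors' implementation. The function names, the toy data, and the choice of statistic (Pearson correlation between sampling date and root-to-tip distance) are all assumptions made for illustration, and the sketch further assumes that every tip in a cluster shares a single sampling date.

```python
import random

def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either variable has no variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5

def permute_dates(dates, clusters, rng):
    """Shuffle dates among clusters, so tips in one cluster move together.

    Assumption: all tips in a cluster share one date (the first one seen is used).
    """
    cluster_date = {}
    for d, c in zip(dates, clusters):
        cluster_date.setdefault(c, d)
    labels = sorted(cluster_date)
    shuffled = [cluster_date[c] for c in labels]
    rng.shuffle(shuffled)
    new_date = dict(zip(labels, shuffled))
    return [new_date[c] for c in clusters]

def permutation_p_value(dates, distances, clusters=None, n_perm=999, seed=0):
    """One-sided p-value for a positive date/distance correlation.

    clusters=None puts each tip in its own cluster (the standard test).
    """
    rng = random.Random(seed)
    if clusters is None:
        clusters = list(range(len(dates)))
    observed = pearson_r(dates, distances)
    hits = sum(
        pearson_r(permute_dates(dates, clusters, rng), distances) >= observed
        for _ in range(n_perm)
    )
    return (hits + 1) / (n_perm + 1)

# Hypothetical toy data: three clades, each sampled in a single year, so
# temporal and genetic structure are completely confounded.
dates = [2000] * 4 + [2005] * 4 + [2010] * 4
clusters = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
distances = [0.010, 0.011, 0.012, 0.013,
             0.020, 0.021, 0.022, 0.023,
             0.030, 0.031, 0.032, 0.033]

p_standard = permutation_p_value(dates, distances)
p_clustered = permutation_p_value(dates, distances, clusters)
print("standard:", p_standard, "clustered:", p_clustered)
```

With only three clusters there are just six distinct ways to reassign the dates, so the clustered test cannot return a very small p-value for these toy data, whereas the tip-level test treats the twelve tips as independent observations.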
Keywords / Materials (for Non-textual outputs)
- Bayesian dating
- Pathogen origins
- Permutation tests
- Staphylococcus aureus