Full genome sequencing of organisms with large and complex genomes is intractable and cost-ineffective under most research budgets. Cycads (Cycadales) represent one of the oldest lineages of extant seed plants and, partly due to their age, have very large genomes of up to ~60 Gbp. Restriction site associated DNA sequencing (RADseq) offers an approach to finding genome-wide informative markers and has proven effective in both model and non-model organisms. We tested the application of RADseq using ezRAD across all ten genera of the Cycadales, including an example dataset of 72 Cycas calcicola samples from natural populations. Reads were mapped to previously available plastid and mitochondrial reference genomes, recovering plastid and mitochondrial genome regions and nuclear markers for all genera. De novo assembly generated up to 138,407 high-depth clusters and up to 1,705 phylogenetically informative loci for the genera, and 4,421 loci for the example assembly of C. calcicola. The number of loci recovered by de novo assembly was lower than in previous RADseq studies, yet still sufficient for downstream analysis. The number of markers could be increased by relaxing our assembly parameters, especially for the C. calcicola dataset. Our results demonstrate the successful application of RADseq across the Cycadales to generate a large number of markers for all genomic compartments, despite the large number of plastids present in a typical plant cell. Our modified protocol is adapted for cycads and other organisms with large genomes and yields many informative genome-wide markers.
- Illumina sequencing
- large genomes