Data from: RAD-sequencing for estimating GRM-based heritability in the wild: a case study in roe deer

  • Lisa Gervais (Creator)
  • C Perrier (Creator)
  • Manon Bernard (Creator)
  • J Merlet (Creator)
  • Josephine Pemberton (Creator)
  • Benoit Pujol (Creator)
  • Erwan Quemere (Université de Toulouse) (Creator)

Dataset

Description

*NCBI BioProject PRJNA533008: Provides all FASTQ data, already demultiplexed and preprocessed.

*sensitivity_analysis.R: R script with explanations for running the sensitivity analysis using ASReml-R v3.00

*phenotype.csv: A CSV file containing individual phenotypic information. ANIMAL is the name of the individual. BODYMASS is its body mass in kg. AGE and SEX correspond to the age class and sex of the individual, respectively.

*grm_uniq_hwe0.05_maxmissXX_mafXX.grm: Genomic Relatedness Matrix built from the SNPs called using the S3 'Intermediate' set of parameters.

*full_indiv_barcorde.txt (not necessary for denovo_map.pl): A file describing the individuals in 4 columns.

*population.map: A file containing two columns with assignments of each of the samples to a particular population.

*stacksoutput_maxloc.vcf: Output file of *populations* program of Stacks software, run using the S1 MaxLoci set of parameters: -m=2 -M=2 -N=4 --max_locus_stacks=5 -n=1

*stacksoutput_minerr.vcf: Output file of *populations* program of Stacks software, run using the S2 MinError set of parameters: -m=11 -M=2 -N=4 --max_locus_stacks=5 -n=1

*stacksoutput_default.vcf: Output file of *populations* program of Stacks software, run using the S4 Default set of parameters: -m=3 -M=2 -N=4 --max_locus_stacks=3 -n=1

*stacksoutput_intermediate.vcf: Output file of *populations* program of Stacks software, run using the S3 Intermediate set of parameters: -m=7 -M=2 -N=4 --max_locus_stacks=5 -n=1

*indiv_to_rm.txt: Empty text file to be filled with the names of individuals that need to be removed from subsequent analyses (e.g. individuals with too much missing data, replicate samples).

*script_QCfiltering.sh: A shell script with explanations for running the different quality filtering procedures using a VCF file as input. Please open it with Notepad.
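
As an illustration of how phenotype.csv and indiv_to_rm.txt might be combined before analysis, here is a minimal Python sketch. It is not part of the data package. It assumes that phenotype.csv is comma-delimited with a header row naming the ANIMAL, BODYMASS, AGE and SEX columns described above, and that indiv_to_rm.txt lists one individual name per line; neither layout is confirmed by the description. The function names and the example values are hypothetical.

```python
import csv
import io


def read_names_to_remove(handle):
    """Return the set of non-empty, stripped lines.

    Assumption: indiv_to_rm.txt holds one individual name per line.
    """
    return {line.strip() for line in handle if line.strip()}


def load_phenotypes(handle, to_remove=frozenset()):
    """Read phenotype rows, skipping individuals listed in to_remove.

    Assumption: comma-delimited file with a header row containing
    ANIMAL, BODYMASS, AGE and SEX.
    """
    rows = []
    for row in csv.DictReader(handle):
        if row["ANIMAL"] in to_remove:
            continue
        rows.append({
            "ANIMAL": row["ANIMAL"],
            "BODYMASS": float(row["BODYMASS"]),  # body mass in kg
            "AGE": row["AGE"],                   # age class
            "SEX": row["SEX"],
        })
    return rows


# Made-up example values, not taken from the dataset.
example_csv = io.StringIO(
    "ANIMAL,BODYMASS,AGE,SEX\n"
    "id1,21.5,adult,F\n"
    "id2,14.0,juvenile,M\n"
)
example_rm = io.StringIO("id2\n\n")

kept = load_phenotypes(example_csv, read_names_to_remove(example_rm))
print([r["ANIMAL"] for r in kept])
```

With the real files, the in-memory examples would be replaced by file handles, e.g. open("phenotype.csv", newline="") and open("indiv_to_rm.txt").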

Abstract

Estimating the evolutionary potential of quantitative traits and reliably predicting responses to selection in wild populations are important challenges in evolutionary biology. The genomic revolution has opened up opportunities for measuring relatedness among individuals with precision, enabling pedigree-free estimation of trait heritabilities in wild populations. However, until now, most quantitative genetic studies based on a genomic relatedness matrix (GRM) have focused on long-term monitored populations for which traditional pedigrees were also available, and have often had access to knowledge of genome sequence and variability. Here, we investigated the potential of RAD-sequencing for estimating heritability in a free-ranging roe deer population for which no prior genomic resources were available. We propose a step-by-step analytical framework to optimize the quality and quantity of the genomic data and explore the impact of the SNP calling and filtering processes on the GRM structure and GRM-based heritability estimates. As expected, our results show that sequence coverage strongly affects the number of recovered loci, the genotyping error rate and the amount of missing data. Ultimately, this had little effect on heritability estimates and their standard errors, provided that the GRM was built from a minimum number of loci (above 7000). GRM-based heritability estimates thus appear robust to a moderate level of genotyping errors in the SNP dataset. We also showed that quality filters, such as the removal of low-frequency variants, affect the relatedness structure of the GRM, generating lower h² estimates. Our work illustrates the huge potential of RAD-sequencing for estimating GRM-based heritability in virtually any natural population.

Data Citation

When using this data, please cite the original publication:

Citation is not yet available for this publication from Molecular Ecology Resources. It will become available shortly after the publication appears.
Additionally, please cite the Dryad data package:

Gervais L, Perrier C, Bernard M, Merlet J, Pemberton J, Pujol B, Quemere E (2019) Data from: RAD-sequencing for estimating GRM-based heritability in the wild: a case study in roe deer. Dryad Digital Repository. https://doi.org/10.5061/dryad.n6d4mm5
Date made available: 1 May 2019
Publisher: Dryad
