Phasing and imputation of single nucleotide polymorphism data of missing parents of bi-parental plant populations

Serap Gonen, Valentin Wimmer, Chris Gaynor, Ed Byrne, Gregor Gorjanc, John Hickey

Research output: Contribution to journal, Article, peer-reviewed

Abstract

This paper presents an extension to a heuristic method for phasing and imputation of genotypes of descendants in bi-parental populations so that it can also phase and impute genotypes of parents that are ungenotyped or partially genotyped. The imputed genotypes of the parent are then used to impute descendants genotyped at low density (LD) to high density (HD). The extension was implemented as part of the AlphaPlantImpute software and works in three steps. First, it identifies whether a parent has no genotypes or only LD genotypes, and identifies its relatives that have HD genotypes. Second, using the HD genotypes of relatives, it determines whether the parent is homozygous or heterozygous at a given locus. Third, it phases heterozygous positions of the parent by matching haplotypes to its relatives. We measured the accuracy (correlation between true and imputed genotypes) of imputing parent genotypes in simulated bi-parental populations from different scenarios. We also measured the imputation accuracy of the missing parent’s descendants when using the true genotypes of the parent and compared this with the accuracy obtained when using the imputed genotypes of the parent. Across all scenarios, the imputation accuracy of a parent was generally >0.98 and did not drop below ~0.96. The imputation accuracy of a parent was always higher when it was inbred than when it was outbred. Including HD-genotyped ancestors of the parent, increasing the number of crosses, and increasing the number of HD descendants each increased the imputation accuracy. The high imputation accuracy achieved for the parent translated to little or no impact on the imputation accuracy of its descendants.
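The second step and the accuracy measure described above can be illustrated with a small sketch. This is not AlphaPlantImpute code: the function names `call_parent_locus` and `imputation_accuracy` are hypothetical, and the calling rule is a deliberate simplification that assumes the allele each HD descendant inherited from the missing parent is already known (coded 0 or 1, with `None` for unknown).

```python
from math import sqrt


def call_parent_locus(inherited_alleles):
    """Toy rule for one locus (assumption, not the published heuristic).

    inherited_alleles: alleles (0 or 1) that descendants received from the
    missing parent; None marks an uninformative descendant.
    Returns the parent's allele dosage (0, 1 or 2), or None if no
    descendant is informative.
    """
    observed = {a for a in inherited_alleles if a is not None}
    if not observed:
        return None   # nothing to base a call on
    if observed == {0}:
        return 0      # only allele 0 seen: call homozygous 0/0
    if observed == {1}:
        return 2      # only allele 1 seen: call homozygous 1/1
    return 1          # both alleles seen: parent must be heterozygous


def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed allele dosages."""
    n = len(true_genotypes)
    if n != len(imputed_genotypes) or n < 2:
        raise ValueError("need two equal-length sequences with >= 2 loci")
    mean_t = sum(true_genotypes) / n
    mean_i = sum(imputed_genotypes) / n
    cov = sum((t - mean_t) * (i - mean_i)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_i = sum((i - mean_i) ** 2 for i in imputed_genotypes)
    if var_t == 0 or var_i == 0:
        raise ValueError("correlation is undefined for constant genotypes")
    return cov / sqrt(var_t * var_i)


# Hypothetical data: four loci, alleles passed to three HD descendants.
gametes_per_locus = [[0, 0, 0], [1, 1, None], [0, 1, 1], [1, 1, 1]]
imputed_parent = [call_parent_locus(g) for g in gametes_per_locus]
true_parent = [0, 2, 1, 2]
accuracy = imputation_accuracy(true_parent, imputed_parent)
```

With few descendants, a heterozygous parent can be miscalled as homozygous simply because only one of its alleles happened to be transmitted, which is consistent with the abstract's observation that more HD descendants and more crosses improve accuracy.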
Original language: English
Journal: Crop Science
Early online date: 23 Nov 2020
Publication status: E-pub ahead of print - 23 Nov 2020
