Limited dimensionality of genomic information and implications on genomic selection

Ivan Pocrnic, Daniela A. L. Lourenco, Gregor Gorjanc, Ignacy Misztal

Research output: Contribution to conference (poster)


The dimensionality of genomic information, that is, of single nucleotide polymorphism (SNP) data, can be described by the concept of independent chromosome segments. This concept can be traced back to Fisher's theory of junctions, and the number of segments is commonly defined as a function of effective population size (Ne) and genome length in Morgans (L). Stam defined the expected number of those segments (Me) as 4NeL for a random mating population of fixed size, and the formula was later modified to account for various considerations (e.g., selection and unequal segment sizes). The concept was "rediscovered" with the advent of genomic selection in livestock populations, where it became a crucial parameter in the theoretical formulas for prediction accuracy. When the numbers of SNPs and of genotyped individuals are both large, dimensionality can be approximated by the number of non-negligible singular values of the gene content matrix or, equivalently, by the number of non-negligible eigenvalues of the genomic relationship matrix (GRM), taken as those that explain 98% of the variation. In previous studies on livestock data, Me was estimated at 10,000 to 15,000 in cattle (U.S. Holsteins and U.S. Angus) and at around 4,000 in commercial pigs and broiler chickens. In these studies, the numbers of eigenvalues that explained 90, 95, and 98% of the GRM variation corresponded roughly to NeL, 2NeL, and 4NeL, respectively. To check how these dimensionalities influence the accuracy of genomic prediction, single-step genomic best linear unbiased prediction (ssGBLUP) was applied with the GRM inverted by a sparse generalized inverse, the algorithm for proven and young (APY), because APY makes indirect use of Me. Interestingly, accuracies peaked at the dimensionality corresponding to 98 to 99% of the variation, or around 4NeL, depending on the population, which indicates that 1 to 2% of the variation in the GRM was due to noise. However, accuracies were only slightly lower at half the optimum dimensionality.
These results show that the effective genomic information is limited, and they offer a new viewpoint on the mechanisms behind genomic selection. Open questions include the impact of limited dimensionality on genome-wide association studies (GWAS), the optimal size of SNP chips, the theoretical formulas for prediction accuracy, and whether genomic selection actually operates on chromosome segments or on something else.
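
The two quantities used in the abstract, Stam's expectation Me = 4NeL and the number of GRM eigenvalues needed to explain a given share of the variation, can be illustrated with a minimal NumPy sketch. It is not the authors' pipeline. The function names, the choice of a VanRaden-style GRM, the Ne and L values, and the simulated genotypes are all assumptions made for illustration. Because the simulated SNPs are independent, the counts it prints say nothing about real livestock populations.

```python
import numpy as np


def expected_segments(ne, genome_length_morgan):
    """Expected number of independent chromosome segments, Me = 4 * Ne * L."""
    return 4.0 * ne * genome_length_morgan


def grm_vanraden(gene_content):
    """GRM from an (individuals x SNPs) matrix of 0/1/2 gene content.

    Assumption: a VanRaden-style GRM with allele frequencies estimated
    from the data; the abstract does not say which GRM definition was used.
    """
    p = gene_content.mean(axis=0) / 2.0           # observed allele frequencies
    z = gene_content - 2.0 * p                    # centered gene content
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))


def n_eigenvalues_explaining(grm, fraction):
    """Smallest number of largest eigenvalues whose sum reaches `fraction`
    of the total eigenvalue sum (the trace) of the GRM."""
    eig = np.linalg.eigvalsh(grm)[::-1]           # eigenvalues, descending
    eig = np.clip(eig, 0.0, None)                 # drop tiny negative round-off
    cumulative = np.cumsum(eig) / eig.sum()
    count = int(np.searchsorted(cumulative, fraction)) + 1
    return min(count, eig.size)


# Hypothetical inputs: Ne = 100 and L = 30 Morgans are illustrative only.
print("Me = 4NeL:", expected_segments(100, 30))

# Simulated, independent SNPs (no linkage disequilibrium), illustration only.
rng = np.random.default_rng(1)
gene_content = rng.binomial(2, 0.3, size=(200, 1000)).astype(float)
grm = grm_vanraden(gene_content)

for fraction in (0.90, 0.95, 0.98):
    print(f"{fraction:.0%} of variation:",
          n_eigenvalues_explaining(grm, fraction), "eigenvalues")
```

With real genotypes, the counts at 90, 95, and 98% could be set against NeL, 2NeL, and 4NeL, as the abstract describes. For large data sets a truncated singular value decomposition of the gene content matrix would be the more practical route than a full eigendecomposition.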
Original language: English
Publication status: Published - 7 Nov 2019
Event: Plant Quantitative Genetics: from Theory into Practice - University of Birmingham, Birmingham, United Kingdom
Duration: 7 Nov 2019 → …


Conference: Plant Quantitative Genetics
Country/Territory: United Kingdom
Period: 7/11/19 → …

