Polygenic risk prediction: why and when out-of-sample prediction R² can exceed SNP-based heritability.

Xiaotong Wang, Alicia Walker, Joana A Revez, Guiyan Ni, Mark J Adams, Andrew M McIntosh, Major Depressive Disorder Working Group of the Psychiatric Genomics Consortium

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

In polygenic score (PGS) analysis, the coefficient of determination (R²) is a key statistic to evaluate efficacy. R² is the proportion of phenotypic variance explained by the PGS, calculated in a cohort that is independent of the genome-wide association study (GWAS) that provided estimates of allelic effect sizes. The SNP-based heritability (h²_SNP, the proportion of total phenotypic variances attributable to all common SNPs) is the theoretical upper limit of the out-of-sample prediction R². However, in real data analyses R² has been reported to exceed h²_SNP, which occurs in parallel with the observation that h²_SNP estimates tend to decline as the number of cohorts being meta-analyzed increases. Here, we quantify why and when these observations are expected. Using theory and simulation, we show that if heterogeneities in cohort-specific h²_SNP exist, or if genetic correlations between cohorts are less than one, h²_SNP estimates can decrease as the number of cohorts being meta-analyzed increases. We derive conditions when the out-of-sample prediction R² will be greater than h²_SNP and show the validity of our derivations with real data from a binary trait (major depression) and a continuous trait (educational attainment). Our research calls for a better approach to integrating information from multiple cohorts to address issues of between-cohort heterogeneity.
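
The decline of the SNP-based heritability estimate with the number of meta-analyzed cohorts can be illustrated with a toy model. The sketch below is not the derivation from the article; it is a minimal illustration under assumptions introduced here: independent standardized SNPs, the same heritability h2 in every cohort, the same pairwise genetic correlation rg between all cohorts, equal cohort weights, and no estimation noise. Under those assumptions the variance explained by the across-cohort mean effect is h2 * (1 + (k - 1) * rg) / k, which falls from h2 toward h2 * rg as the number of cohorts k grows. The function names and parameters are hypothetical.

```python
import random


def expected_meta_h2(h2, rg, k):
    """Closed-form variance explained by the mean effect across k cohorts
    in the toy model (equal h2, equal pairwise rg, equal weights)."""
    return h2 * (1 + (k - 1) * rg) / k


def simulated_meta_h2(h2, rg, k, n_snps=20000, seed=1):
    """Monte Carlo check: draw correlated per-cohort SNP effects, average
    them across cohorts, and sum the squared mean effects."""
    rng = random.Random(seed)
    sd = (h2 / n_snps) ** 0.5  # per-SNP effect SD so each cohort explains h2
    total = 0.0
    for _ in range(n_snps):
        shared = rng.gauss(0, sd)  # component shared by all cohorts
        effects = [
            rg ** 0.5 * shared + (1 - rg) ** 0.5 * rng.gauss(0, sd)
            for _ in range(k)
        ]  # pairwise correlation between cohort effects equals rg
        mean_effect = sum(effects) / k
        total += mean_effect ** 2
    return total


if __name__ == "__main__":
    for k in (1, 2, 5, 10):
        print(k, expected_meta_h2(0.2, 0.8, k), simulated_meta_h2(0.2, 0.8, k))
```

With rg equal to one the expression reduces to h2 for any k, so in this toy model the decline arises only when cohorts are genetically heterogeneous.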

Original language: English
Pages (from-to): 1207-1215
Number of pages: 9
Journal: American Journal of Human Genetics
Volume: 110
Issue number: 7
Publication status: Published - 6 Jul 2023

Keywords / Materials (for Non-textual outputs)

  • Humans
  • Genome-Wide Association Study
  • Polymorphism, Single Nucleotide/genetics
  • Multifactorial Inheritance/genetics
  • Phenotype
  • Computer Simulation
