Edinburgh Research Explorer

Genomic prediction for complex traits following feature selection: results from Bayes C and genomic best linear unbiased prediction (G-BLUP).

Research output: Contribution to conference, Paper, peer-reviewed

Original language: English
Publication status: Published - 1 Apr 2014
Event: 42nd European Mathematical Genetics Meeting (EMGM) - Cologne, Germany
Duration: 1 Apr 2014 to 2 Apr 2014




Genome-wide association studies (GWAS) have identified thousands of SNPs associated with health-related traits and thus provide a source of information about useful predictors for these traits. Best practices for implementing genomic prediction approaches with these high-dimensional GWAS data have yet to be determined. One important issue is feature selection (i.e. selection of SNPs exhibiting non-redundant information), which could reduce model complexity and computational requirements. In this study we investigated the effect of supervised feature selection on the performance of two widely used prediction methods: Bayes C and genomic best linear unbiased prediction (G-BLUP). We explored prediction of the complex traits height, high-density lipoproteins (HDL) and body mass index (BMI) within a population of 2,186 Croatian individuals and into a replication population of 810 UK individuals (ORCADES). Using all 263,357 markers, Bayes C and G-BLUP had similar prediction accuracy across all traits within the Croatian data, and for the highly polygenic traits height and BMI when predicting into the ORCADES data. Although Bayes C outperformed G-BLUP in the prediction of HDL (which is influenced by fewer quantitative trait loci than BMI and height) into the ORCADES data, it was more than 3,000 times slower computationally than G-BLUP. However, the application of supervised feature selection allowed G-BLUP to achieve predictive performance equivalent to that of Bayes C with greatly reduced computational effort. Feature selection in the G-BLUP framework therefore provides a flexible and more efficient alternative to the computationally expensive Bayes C for all traits considered in this study.
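
The workflow described above (supervised SNP selection followed by G-BLUP) can be illustrated with a minimal sketch on simulated data. The abstract does not state which selection criterion, heritability values, or software were used, so everything here is an assumption made for illustration: markers are ranked by absolute marginal correlation with the trait in the training set, G-BLUP is computed from a marker-based relationship matrix with an assumed heritability `h2`, and the function names (`standardize`, `select_markers`, `gblup_predict`, `simulate_and_compare`) are hypothetical. It is not the authors' implementation.

```python
import numpy as np


def standardize(train, test):
    """Center and scale marker columns using training-set statistics only."""
    mean = train.mean(axis=0)
    sd = train.std(axis=0)
    sd[sd == 0] = 1.0  # guard against monomorphic markers
    return (train - mean) / sd, (test - mean) / sd


def select_markers(Z_train, y_train, n_keep):
    """Assumed selection rule: keep the n_keep markers with the largest
    absolute marginal association with the trait in the training data."""
    yc = y_train - y_train.mean()
    score = np.abs(Z_train.T @ yc)
    return np.argsort(score)[::-1][:n_keep]


def gblup_predict(Z_train, y_train, Z_test, h2=0.5):
    """G-BLUP prediction for test individuals, given an assumed heritability h2."""
    n, m = Z_train.shape
    G_tt = Z_train @ Z_train.T / m   # training relationship matrix
    G_st = Z_test @ Z_train.T / m    # test-by-training relationships
    lam = (1.0 - h2) / h2            # residual-to-genetic variance ratio
    mu = y_train.mean()
    alpha = np.linalg.solve(G_tt + lam * np.eye(n), y_train - mu)
    return mu + G_st @ alpha


def simulate_and_compare(seed=0, n_train=400, n_test=200, m=2000,
                         n_causal=10, h2=0.8, n_keep=50):
    """Simulate genotypes and a trait, then compare G-BLUP accuracy
    using all markers against G-BLUP after supervised selection."""
    rng = np.random.default_rng(seed)
    n = n_train + n_test
    freq = rng.uniform(0.1, 0.9, size=m)
    X = rng.binomial(2, freq, size=(n, m)).astype(float)  # 0/1/2 genotypes
    causal = rng.choice(m, size=n_causal, replace=False)
    g = X[:, causal] @ rng.normal(size=n_causal)
    e = rng.normal(scale=np.sqrt(g.var() * (1.0 - h2) / h2), size=n)
    y = g + e

    Z_tr, Z_te = standardize(X[:n_train], X[n_train:])
    y_tr, y_te = y[:n_train], y[n_train:]

    pred_all = gblup_predict(Z_tr, y_tr, Z_te, h2=h2)
    # Selection uses training data only, so the test set stays independent.
    keep = select_markers(Z_tr, y_tr, n_keep)
    pred_sel = gblup_predict(Z_tr[:, keep], y_tr, Z_te[:, keep], h2=h2)

    return {
        "n_selected": len(keep),
        "acc_all": float(np.corrcoef(pred_all, y_te)[0, 1]),
        "acc_selected": float(np.corrcoef(pred_sel, y_te)[0, 1]),
    }


if __name__ == "__main__":
    print(simulate_and_compare())
```

Selecting markers on the training individuals alone matters here: ranking SNPs with the replication individuals included would leak information and inflate the apparent accuracy. The computational saving comes from building the relationship matrix on `n_keep` markers instead of all `m`.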





Event: Conference

ID: 16975253