Machine learning can improve prediction of lifetime major depressive disorder in Generation Scotland: Scottish Family Health Study

Mairead Bermingham, Daniel Urda Munoz, Felix Agakov, Archie Campbell, Caroline Hayward, Ella Wigmore, Jude Gibson, Toni-Kim Clarke, Ana Maria Fernández Pujals, Donald MacIntyre, Andrew McIntosh, Paul McKeigue, David Porteous, Kristin Nicodemus

Research output: Contribution to conference › Poster › peer-review

Abstract / Description of output

Major depressive disorder (MDD) is one of the most common mental illnesses, with a lifetime prevalence of around 15%. Accurate diagnosis is therefore of critical importance in reducing the disease burden. In this study, we evaluated the predictive value of a wide range of family history, clinical, demographic and genomic variables in the classification of MDD outcome in the Generation Scotland: Scottish Family Health Study (GS:SFHS) cohort. To do this, an extensive collection of machine learning (ML) methods was used. After screening, approximately 3,000 of the 21,476 GS:SFHS participants undertook the structured clinical interview from the Diagnostic and Statistical Manual of Mental Disorders to diagnose MDD. Following data editing, 1,130 MDD cases and 5,043 controls were available for inclusion in the analysis. The data were then randomly split into training (63%) and independent test (37%) sets to assess model performance. Ten-fold cross-validation within the training data was used to train the models. Classification performance and degree of calibration in the test data were assessed using the area under the receiver operating characteristic curve (AUC) and the Hosmer–Lemeshow goodness-of-fit (HLGOF) test, respectively. Family history, clinical, and demographic variables provided the best predictive ability. The gradient descent boosting algorithm (GBM), closely followed by LASSO, showed the highest performance, demonstrating good discriminatory power and clinical utility (AUC = 0.846, 95% confidence interval: 0.845-0.847), and was well calibrated (HLGOF χ² = 6.80, p = 0.078) in the test data. These ML methods, particularly GBM, provided robust classification for the diagnosis of lifetime MDD in a large Scottish population-based cohort.
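The evaluation steps named in the abstract (a 63%/37% random split, ten folds within the training set, AUC, and the Hosmer–Lemeshow statistic) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the seed, the ten-group default for the Hosmer–Lemeshow statistic, and the synthetic data are all hypothetical, and no GBM or LASSO model is fitted here.

```python
import random


def split_train_test(n, train_fraction=0.63, seed=0):
    """Randomly split indices 0..n-1 into training and test index lists."""
    rng = random.Random(seed)
    idx = list(range(n))
    rng.shuffle(idx)
    cut = round(n * train_fraction)
    return idx[:cut], idx[cut:]


def k_folds(indices, k=10):
    """Partition a list of indices into k disjoint folds of near-equal size."""
    return [indices[i::k] for i in range(k)]


def auc(y_true, y_score):
    """Rank-based AUC: probability a random case outscores a random control."""
    pairs = sorted(zip(y_score, y_true))
    n = len(pairs)
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        for k in range(i, j + 1):
            ranks[k] = (i + j) / 2 + 1  # average rank for tied scores
        i = j + 1
    n_pos = sum(y for _, y in pairs)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one case and one control")
    rank_sum = sum(r for r, (_, y) in zip(ranks, pairs) if y == 1)
    return (rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def hosmer_lemeshow(y_true, y_prob, groups=10):
    """Hosmer-Lemeshow chi-square statistic over groups of sorted predicted risk."""
    order = sorted(range(len(y_prob)), key=lambda i: y_prob[i])
    n = len(order)
    stat = 0.0
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        if not idx:
            continue
        observed = sum(y_true[i] for i in idx)
        expected = sum(y_prob[i] for i in idx)
        mean_p = expected / len(idx)
        denom = len(idx) * mean_p * (1 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
    return stat


if __name__ == "__main__":
    rng = random.Random(42)
    n = 6173  # 1,130 cases + 5,043 controls in the abstract; data below are synthetic
    prob = [rng.uniform(0.02, 0.35) for _ in range(n)]  # stand-in for model output
    label = [1 if rng.random() < p else 0 for p in prob]
    train_idx, test_idx = split_train_test(n)
    folds = k_folds(train_idx)  # would drive model tuning in a real pipeline
    y_test = [label[i] for i in test_idx]
    p_test = [prob[i] for i in test_idx]
    print(len(train_idx), len(test_idx), len(folds))
    print("test AUC:", round(auc(y_test, p_test), 3))
    print("HL statistic:", round(hosmer_lemeshow(y_test, p_test), 2))
```

The statistic is conventionally compared against a chi-square distribution with (number of groups minus 2) degrees of freedom; the grouping used in the study is not stated in the abstract.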
Original language: English
Publication status: Published - 22 Apr 2016
Event: 2nd Scottish Biomedical Postdoctoral Researcher Conference - Appleton Tower, University of Edinburgh, Edinburgh, United Kingdom
Duration: 22 Apr 2016 → …


Conference: 2nd Scottish Biomedical Postdoctoral Researcher Conference
Country/Territory: United Kingdom
Period: 22/04/16 → …

Keywords / Materials (for Non-textual outputs)

  • Major Depressive Disorder
  • Prediction
  • Clinical
  • Family history
  • Demographic factors
  • Lifestyle factors
  • Genomic factors
  • Machine learning
  • Clinical utility
  • Diagnostic accuracy
  • Recurrent major depressive disorder
  • Lifetime major depressive disorder

