Methods
We applied supervised machine learning, using regularised regression and nested leave-one-site-out cross-validation, to baseline clinical data from the English Evaluating the Development and Impact of Early Intervention Services (EDEN) study (n=1027) to develop and internally validate models predicting outcomes at 1-year follow-up. We assessed four binary outcomes recorded at 1 year: symptom remission, social recovery, vocational recovery, and quality of life (QoL). For external validation, we selected, from the top predictor variables identified in the internal validation models, those variables that were also available in the external validation datasets, which comprised two Scottish longitudinal cohort studies (n=162) and the OPUS trial, a randomised controlled trial of specialised assertive intervention versus standard treatment (n=578).
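The fold structure of nested leave-one-site-out cross-validation can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the function name `nested_loso_splits` and the example site labels are hypothetical, and model fitting is omitted. In a setup of this kind, the inner folds would typically be used to choose the regularisation strength, and the outer held-out site to estimate performance.

```python
from collections import defaultdict


def nested_loso_splits(sites):
    """Yield nested leave-one-site-out folds from per-participant site labels.

    Each item is (held_out_site, train_idx, test_idx, inner_folds), where
    inner_folds is a list of (inner_train_idx, inner_val_idx) pairs built
    only from the outer training sites, so the held-out site never leaks
    into hyperparameter tuning.
    """
    by_site = defaultdict(list)
    for i, site in enumerate(sites):
        by_site[site].append(i)
    names = sorted(by_site)
    if len(names) < 3:
        raise ValueError("nested leave-one-site-out needs at least three sites")

    for outer in names:
        # Outer loop: hold out one whole site for performance estimation.
        remaining = [s for s in names if s != outer]
        train_idx = [i for s in remaining for i in by_site[s]]

        # Inner loop: hold out each remaining site in turn for tuning.
        inner_folds = []
        for inner in remaining:
            inner_train = [i for s in remaining if s != inner for i in by_site[s]]
            inner_folds.append((inner_train, list(by_site[inner])))

        yield outer, train_idx, list(by_site[outer]), inner_folds


# Hypothetical example: six participants recruited at three sites.
example_sites = ["A", "A", "B", "C", "C", "C"]
for held_out, train_idx, test_idx, inner_folds in nested_loso_splits(example_sites):
    print(held_out, train_idx, test_idx, len(inner_folds))
```

Splitting by site rather than by participant means each performance estimate comes from a site the model has never seen, which is closer to the situation faced when a model is deployed at a new service.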
Findings
On internal validation, the performance of the prediction models was robust for all four 1-year outcomes: symptom remission (area under the receiver operating characteristic curve [AUC] 0·703, 95% CI 0·664–0·742), social recovery (0·731, 0·697–0·765), vocational recovery (0·736, 0·702–0·771), and QoL (0·704, 0·667–0·742; p<0·0001 for all outcomes). We externally validated the models for symptom remission (AUC 0·680, 95% CI 0·587–0·773), vocational recovery (0·867, 0·805–0·930), and QoL (0·679, 0·522–0·836) in the Scottish datasets, and for symptom remission (0·616, 0·553–0·679), social recovery (0·573, 0·504–0·643), vocational recovery (0·660, 0·610–0·710), and QoL (0·556, 0·481–0·631) in the OPUS dataset.
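For readers less familiar with the metric, the sketch below shows one way an AUC and an approximate 95% CI can be computed for a binary outcome. The source does not state how its intervals were derived; the Hanley-McNeil standard error used here is an assumption chosen for illustration, and the function name `auc_with_ci` and the toy data are hypothetical.

```python
import math


def auc_with_ci(labels, scores, z=1.96):
    """Return (auc, lower, upper) for binary labels (1 = event, 0 = no event).

    The AUC is the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case (ties count as one half).
    The interval uses the Hanley-McNeil standard error, which is an
    assumption for this sketch and not necessarily the study's method.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")

    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))

    # Hanley-McNeil approximation of the variance of the AUC.
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    var = (
        auc * (1 - auc)
        + (len(pos) - 1) * (q1 - auc ** 2)
        + (len(neg) - 1) * (q2 - auc ** 2)
    ) / (len(pos) * len(neg))
    se = math.sqrt(max(var, 0.0))

    return auc, max(0.0, auc - z * se), min(1.0, auc + z * se)


# Toy data: two cases with the outcome, two without.
auc, lower, upper = auc_with_ci([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(f"AUC {auc:.3f}, 95% CI {lower:.3f}-{upper:.3f}")
```

An AUC of 0·5 corresponds to chance-level discrimination, so an interval whose lower bound sits close to 0·5 indicates that discrimination in that dataset is only weakly supported.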
Interpretation
Our machine learning analysis showed that prediction models can reliably and prospectively identify poor remission and recovery outcomes at 1 year for patients with first-episode psychosis, using baseline clinical variables collected at first clinical contact.