Leveraging multiple machine learning techniques to predict major life outcomes from a small set of psychological and socioeconomic variables: A combined bottom-up/top-down approach

Research output: Contribution to journal › Article › peer-review

Abstract

Predicting longitudinal outcomes from thousands of variables across multiple waves provides impressive opportunities to identify variables of importance, but what is the most efficient way to carry out such analyses on hundreds or thousands of variables? As part of the Fragile Families Challenge, a series of analyses were conducted that aimed at identifying a few reliable, important variables, primarily with machine learning approaches given minimal oversight. Using generalized boosted models, random forests, and elastic net regression models, these analyses identified a consistent set of psychological and socioeconomic factors that yielded strong prediction scores in generalized linear models. These results demonstrate that relatively simple models fitted to the Fragile Families data can generate predictions that perform close to state-of-the-art predictive models.
Original language: English
Pages (from-to): 1-9
Journal: Socius: Sociological Research for a Dynamic World
Volume: 5
Publication status: Published - 10 Sep 2019

Keywords

  • socioeconomic disadvantage
  • cognitive ability
  • variable selection
  • prediction
  • family background
