TY - JOUR
T1 - Mapping of machine learning approaches for description, prediction, and causal inference in the social and health sciences
AU - Leist, Anja K
AU - Klee, Matthias
AU - Kim, Jung Hyun
AU - Rehkopf, David H
AU - Bordas, Stéphane P A
AU - Muniz-Terrera, Graciela
AU - Wade, Sara
N1 - Funding Information:
We would like to thank B. Wilbertz, Z. Zhalama, T. Q. B. Chaves, as well as the DEMON (demondementia.com) Prevention Working Group and our colleagues at IRSEI for helpful discussions when we started the work on this manuscript in 2021. We also thank editor B. Hammer and two anonymous reviewers for comments on earlier versions of this manuscript. This work was supported by the European Research Council (grant agreement no. 803239 to A.K.L.). S.W. is a Royal Society of Edinburgh (RSE) Sabbatical Research Grant Holder; this work was supported by the RSE under grant 69938. Author contributions: A.K.L. conceptualized the paper. A.K.L., M.K., J.H.K., and S.W. provided methodology. M.K. and S.W. visualized the concepts and approaches. A.K.L., M.K., J.H.K., G.M.-T., and S.W. wrote the original draft. All authors reviewed and edited the draft and approved the final version.
Publisher Copyright:
Copyright © 2022 The Authors, some rights reserved.
PY - 2022/10/19
Y1 - 2022/10/19
N2 - Machine learning (ML) methodology used in the social and health sciences needs to fit the intended research purposes of description, prediction, or causal inference. This paper provides a comprehensive, systematic meta-mapping of research questions in the social and health sciences to appropriate ML approaches by incorporating the requirements for statistical analysis in these disciplines. We map the established classification into description, prediction, counterfactual prediction, and causal structural learning to common research goals, such as estimating prevalence of adverse social or health outcomes, predicting the risk of an event, and identifying risk factors or causes of adverse outcomes, and explain common ML performance metrics. Such mapping may help to fully exploit the benefits of ML while considering domain-specific aspects relevant to the social and health sciences and hopefully contribute to the acceleration of the uptake of ML applications to advance both basic and applied social and health sciences research.
U2 - 10.1126/sciadv.abk1942
DO - 10.1126/sciadv.abk1942
M3 - Review article
C2 - 36260666
SN - 2375-2548
VL - 8
SP - eabk1942
JO - Science Advances
JF - Science Advances
IS - 42
ER -