Handling Class Imbalance in Machine Learning-based Prediction Models: A Case Study in Asthma Management

Arif Budiarto*, Aziz Sheikh, Andrew Wilson, David B. Price, Syed Ahmar Shah

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

A data-driven prediction tool has the potential to provide early warning of an asthma attack and improve asthma management and outcomes. Most previous machine learning (ML)-based studies for asthma attack prediction have reported a severe class imbalance, with major implications for model performance. We aimed to undertake a systematic comparison of several class imbalance handling techniques in the context of risk prediction models for asthma prognosis. We used data from 9,835 asthma patients extracted from the Medical Information Mart for Intensive Care (MIMIC) IV database and deployed five class imbalance handling methods based on synthetic minority oversampling technique (SMOTE) and cost function customisation. We then compared their performances in improving two-class classifier models developed using logistic regression (LR) and extreme gradient boosting (XGBoost) for three different prediction tasks with varying severity of class imbalance (proportion of majority class ranging from 90.86% to 98.98%). The cost function customisation technique substantially outperformed the SMOTE-based methods in all tasks. XGBoost combined with cost function customisation achieved the highest prediction performance for the outcome with the most extreme class imbalance ratio (AUC = 0.72). Our findings suggest that the cost function customisation-based approach to tackle class imbalance provides substantially better performance compared to oversampling in the context of asthma management.

Clinical Relevance: This study underscores the challenge of class imbalance in the context of prediction tools to improve asthma management and outcomes and provides a methodological solution that addresses the challenge. Accurate asthma prediction tools can provide early warning and potentially prevent deterioration, thereby improving the quality of life of patients with asthma.
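To make the idea of cost function customisation concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which weighting scheme was used, so the sketch assumes inverse-frequency ("balanced") class weights, and the helper names `balanced_weights` and `weighted_log_loss` are hypothetical. It uses only the two majority-class proportions reported above (90.86% and 98.98%) to show how heavily a minority-class error would be penalised under that assumption.

```python
import math


def balanced_weights(majority_fraction):
    """Return (w_majority, w_minority) for a two-class problem.

    Assumption: inverse-frequency weights w_c = n / (2 * n_c), written
    here in terms of class fractions. The source abstract does not state
    the weighting scheme actually used.
    """
    if not 0.5 <= majority_fraction < 1.0:
        raise ValueError("majority_fraction must be in [0.5, 1.0)")
    minority_fraction = 1.0 - majority_fraction
    return 1.0 / (2.0 * majority_fraction), 1.0 / (2.0 * minority_fraction)


def weighted_log_loss(y_true, p_pred, w_majority, w_minority, eps=1e-12):
    """Weighted binary cross-entropy; label 1 marks the minority class."""
    total = 0.0
    weight_sum = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1.0 - eps)  # keep log() finite
        w = w_minority if y == 1 else w_majority
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
        weight_sum += w
    return total / weight_sum


# Majority-class proportions reported in the abstract.
for fraction in (0.9086, 0.9898):
    w_maj, w_min = balanced_weights(fraction)
    print(f"majority={fraction:.2%}  w_majority={w_maj:.3f}  "
          f"w_minority={w_min:.3f}  ratio={w_min / w_maj:.1f}")
```

Under this assumption the weight ratio equals the majority-to-minority ratio, so a missed minority-class case costs roughly ten times as much as a majority-class error in the least imbalanced task and close to a hundred times as much in the most imbalanced one. Unlike SMOTE, this changes only the loss and leaves the training data untouched.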

Original language: English
Title of host publication: 2023 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers
Number of pages: 5
ISBN (Electronic): 9798350324471
Publication status: Published - 27 Jul 2023
Event: 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023 - Sydney, Australia
Duration: 24 Jul 2023 to 27 Jul 2023

Publication series

Name: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS
Publisher: IEEE
ISSN (Print): 1557-170X
ISSN (Electronic): 2694-0604

Conference

Conference: 45th Annual International Conference of the IEEE Engineering in Medicine and Biology Conference, EMBC 2023
Country/Territory: Australia
City: Sydney
Period: 24/07/23 to 27/07/23
