Identifying Risk Factors for Acute Asthma Attacks: Application of Machine Learning in a Country-Wide Mobile Health Study

Kevin Cheuk Him Tsang, Hilary Pinnock, Andrew M. Wilson, Syed Ahmar Shah

Research output: Working paper › Preprint

Abstract / Description of output

Asthma is a variable long-term condition that affects 339 million people worldwide who are at risk of acute deteriorations or attacks. Because triggers, patterns, and risk of attacks vary from person to person, asthma can be difficult to study in small cohorts, but recent mobile-based studies like the Asthma Mobile Health Study (AMHS) provide an important opportunity to collect data from large populations. The AMHS is a publicly available dataset collected using a smartphone app from 10,010 asthma patients across the United States.
Using data-driven methods, we aimed to identify different clusters of asthma patients based on patterns of clinical deterioration that may lead to loss of productivity, and determine key factors associated with each patient cluster.
Based on existing asthma knowledge, 27 variables about the patient’s history, demographics, behaviour, and self-reported symptoms were extracted to generate 63 features. Of the 63 features, 10 were markers of attacks that were used to cluster patients with the k-means algorithm. We subsequently used a supervised learning approach, least absolute shrinkage and selection operator (LASSO), to rank the remaining 53 features and identify key risk factors associated with each patient cluster. The models were validated with 10-fold cross-validation.
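The two-stage approach described above (cluster on attack markers, then rank the remaining features with LASSO) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the feature names, the farthest-point centroid initialisation, the linear (rather than logistic) LASSO solved by coordinate descent, and the penalty `alpha = 0.05` are all hypothetical choices made for the example, and the 10-fold cross-validation step is omitted.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=100):
    """Lloyd's k-means with deterministic farthest-point initialisation
    (the initialisation is an assumption; the source does not specify one)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def standardise(rows):
    """Z-score each column (assumes no column is constant)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]

def soft_threshold(z, t):
    """Shrink z towards zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso(X, y, alpha, iters=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by coordinate descent.
    Expects a centred y and (ideally) standardised columns in X."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    resid = list(y)
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(iters):
        for j in range(p):
            rho = sum(X[i][j] * (resid[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_w - w[j]
            if delta:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                w[j] = new_w
    return w

# Hypothetical attack markers per patient: [unscheduled visits, missed work days].
markers = [[0, 0], [0, 1], [1, 0], [8, 0], [9, 1], [8, 1],
           [0, 8], [1, 9], [1, 8], [8, 8], [9, 9], [9, 8]]
labels, centroids = kmeans(markers, k=4)

# Target: membership of the cluster with the highest combined burden.
high = max(range(4), key=lambda c: sum(centroids[c]))
y = [1.0 if l == high else 0.0 for l in labels]
y_mean = sum(y) / len(y)
y_centred = [v - y_mean for v in y]

# Hypothetical remaining features, one row per patient.
names = ["nocturnal_symptoms", "activity_limitation", "unrelated_feature"]
features = [[0, 0, 1], [1, 0, 0], [0, 1, 1], [1, 2, 0], [2, 1, 1], [1, 1, 0],
            [1, 1, 1], [1, 2, 0], [2, 1, 1], [5, 4, 0], [6, 4, 1], [5, 5, 0]]
weights = lasso(standardise(features), y_centred, alpha=0.05)

# Rank features by absolute coefficient; zeroed features drop to the bottom.
ranked = sorted(zip(names, weights), key=lambda t: abs(t[1]), reverse=True)
print(ranked)
```

In this toy setting the L1 penalty shrinks the coefficient of the unrelated feature to exactly zero, which is the property that makes LASSO usable for feature ranking and selection.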
Using data from the 827 AMHS participants with sufficient data, k-means clustering formed four patient clusters based on unscheduled healthcare usage and missed work. The most important factors contributing to the clustering were nocturnal symptoms, activity limitation, and sex. Being female and having asthma that affects sleep and activity levels were the key risk factors associated with having an asthma attack that necessitates unscheduled medical care and time off work. Our internal validation resulted in an area under the curve (AUC) of up to 0.80.
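For readers unfamiliar with the AUC, the short sketch below computes it directly from its pairwise definition: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as half. The labels and scores are made-up toy values chosen only to illustrate the calculation; they are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy example (hypothetical): 2 positive and 5 negative cases.
toy_labels = [1, 1, 0, 0, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.1, 0.2, 0.3, 0.5, 0.6]
toy_auc = auc(toy_labels, toy_scores)
print(toy_auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.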
The data-driven approach found risk factors associated with increased levels of asthma attacks that reflected those recognised in clinical practice. Future research about asthma risk factors should include these measures and also consider including work and school absence as markers of
Original language: Undefined/Unknown
Publication status: Published - 10 Aug 2021
