The power of data mining in diagnosis of childhood pneumonia

Elina Naydenova*, Athanasios Tsanas, Stephen Howie, Climent Casals-Pascual, Maarten De Vos

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review


Childhood pneumonia is the leading cause of death among children under the age of 5 years globally. Diagnostic information on the presence of infection, severity and aetiology (bacterial versus viral) is crucial for appropriate treatment. However, deriving such information requires advanced equipment (such as X-rays) and clinical expertise to correctly assess observational clinical signs (such as chest indrawing); both of these are often unavailable in resource-constrained settings. In this study, these challenges were addressed through the development of a suite of data mining tools, facilitating automated diagnosis through quantifiable features. Findings were validated on a large dataset comprising 780 children diagnosed with pneumonia and 801 age-matched healthy controls. Pneumonia was identified via four quantifiable vital signs (98.2% sensitivity and 97.6% specificity). Moreover, it was shown that severity can be determined through a combination of three vital signs and two lung sounds (72.4% sensitivity and 82.2% specificity); addition of a conventional biomarker (C-reactive protein) further improved severity predictions (89.1% sensitivity and 81.3% specificity). Finally, it was shown that aetiology can be determined using three vital signs and a newly proposed biomarker (lipocalin-2) (81.8% sensitivity and 90.6% specificity). These results suggest that a suite of carefully designed machine learning tools can be used to support multi-faceted diagnosis of childhood pneumonia in resource-constrained settings, compensating for the shortage of expensive equipment and highly trained clinicians.
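For readers less familiar with the two metrics reported throughout the abstract, the following is a minimal sketch of how sensitivity and specificity are computed from binary predictions. The function name `sensitivity_specificity` and the toy labels are hypothetical illustrations; they are not the study's data or code, and the abstract does not describe the authors' implementation.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Convention assumed here: 1 = pneumonia case, 0 = healthy control.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases correctly flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls correctly cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls wrongly flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case and one control")

    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of true controls cleared
    return sensitivity, specificity


# Hypothetical toy labels, NOT taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

In a screening context such as the one described, high sensitivity means few children with pneumonia are missed, while high specificity means few healthy children are wrongly flagged.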

Original language: English
Article number: 20160266
Number of pages: 10
Journal: Journal of the Royal Society, Interface
Issue number: 120
Early online date: 27 Jul 2016
Publication status: E-pub ahead of print - 27 Jul 2016


  • childhood pneumonia
  • machine learning
  • diagnostics

