Background Up to half of patients with dementia may not receive a formal diagnosis, limiting access to appropriate services. It is hypothesised that it may be possible to identify undiagnosed dementia from a profile of symptoms recorded in routine clinical practice. Aim The aim of this study is to develop a machine learning-based model that could be used in general practice to detect dementia from routinely collected NHS data. The model would be a useful tool for identifying people who may be living with dementia but have not been formally diagnosed. Design & setting The study involved a case-control design and analysis of primary care data routinely collected over a 2-year period. Dementia diagnosed during the study period was compared to no diagnosis of dementia during the same period using pseudonymised routinely collected primary care clinical data. Method Routinely collected Read-encoded data were obtained from 18 consenting GP surgeries across Devon, for 26 483 patients aged >65 years. The authors determined Read codes assigned to patients that may contribute to dementia risk. These codes were used as features to train a machine-learning classification model to identify patients that may have underlying dementia. Results The model obtained sensitivity and specificity values of 84.47% and 86.67%, respectively. Conclusion The results show that routinely collected primary care data may be used to identify undiagnosed dementia. The methodology is promising and, if successfully developed and deployed, may help to increase dementia diagnosis in primary care.