Prevalence and risk factors for long COVID among adults in Scotland using electronic health records: a national, retrospective, observational cohort study

Karen Jeffrey, Lana Woolford, Rishma Maini, Siddharth Basetti, Ashleigh Batchelor, David Weatherill, Christopher White, Victoria Hammersley, Tristan Millington, Calum Macdonald, Jennifer Quint, Robin Kerr, Steven Kerr, Syed Ahmar Shah, Igor Rudan, Adeniyi Francis Fagbamigbe, Colin Simpson, Srinivasa Vital Katikireddi, Chris Robertson, Lewis Ritchie, Aziz Sheikh, Luke Daines*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract / Description of output

BACKGROUND: Long COVID is a debilitating multisystem condition. The objective of this study was to estimate the prevalence of long COVID in the adult population of Scotland, and to identify risk factors associated with its development.

METHODS: In this national, retrospective, observational cohort study, we analysed electronic health records (EHRs) for all adults (≥18 years) registered with a general medical practice and resident in Scotland between March 1, 2020, and October 26, 2022 (98-99% of the population). We linked data from primary care, secondary care, laboratory testing and prescribing. Four outcome measures were used to identify long COVID: clinical codes, free text in primary care records, free text on sick notes, and a novel operational definition. The operational definition was developed using Poisson regression to identify clinical encounters indicative of long COVID from a sample of negative and positive COVID-19 cases matched on time-varying propensity to test positive for SARS-CoV-2. Possible risk factors for long COVID were identified by stratifying descriptive statistics by long COVID status.

FINDINGS: Of 4,676,390 participants, 81,219 (1.7%) were identified as having long COVID. Clinical codes identified the fewest cases (n = 1,092, 0.02%), followed by free text (n = 8,368, 0.2%), sick notes (n = 14,469, 0.3%), and the operational definition (n = 64,193, 1.4%). There was limited overlap in cases identified by the measures; however, temporal trends and patient characteristics were consistent across measures. Compared with the general population, a higher proportion of people with long COVID were female (65.1% versus 50.4%), aged 38-67 (63.7% versus 48.9%), overweight or obese (45.7% versus 29.4%), had one or more comorbidities (52.7% versus 36.0%), were immunosuppressed (6.9% versus 3.2%), shielding (7.9% versus 3.4%), or hospitalised within 28 days of testing positive (8.8% versus 3.3%), and had tested positive before Omicron became the dominant variant (44.9% versus 35.9%). The operational definition identified long COVID cases with combinations of clinical encounters (from four symptoms, six investigation types, and seven management strategies) recorded in EHRs within 4-26 weeks of a positive SARS-CoV-2 test. These combinations were significantly (p < 0.0001) more prevalent in positive COVID-19 patients than in matched negative controls. In a case-crossover analysis, 16.4% of those identified by the operational definition had similar healthcare patterns recorded before testing positive.
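The prevalence figures above can be checked directly from the reported counts. The following is a minimal sketch in Python, not the study's own analysis code; the variable and function names are illustrative assumptions, and only the counts come from the abstract. It recomputes each percentage and compares the sum of the per-measure counts with the number of people identified by any measure, which gives a rough sense of how limited the overlap is.

```python
# Counts as reported in the FINDINGS paragraph; all names are illustrative.
TOTAL_PARTICIPANTS = 4_676_390
IDENTIFIED_BY_ANY_MEASURE = 81_219
CASES_BY_MEASURE = {
    "clinical codes": 1_092,
    "free text": 8_368,
    "sick notes": 14_469,
    "operational definition": 64_193,
}


def prevalence_pct(cases: int, population: int) -> float:
    """Return cases as a percentage of the population."""
    return 100 * cases / population


# Per-measure prevalence, then the combined figure.
for measure, cases in CASES_BY_MEASURE.items():
    print(f"{measure}: {prevalence_pct(cases, TOTAL_PARTICIPANTS):.2f}%")
print(f"any measure: {prevalence_pct(IDENTIFIED_BY_ANY_MEASURE, TOTAL_PARTICIPANTS):.2f}%")

# Someone flagged by k measures is counted k times in the per-measure sum but
# once in the combined total, so the surplus counts repeat identifications.
# It is an upper bound on the number of people flagged by more than one
# measure, not an exact count of them.
repeat_identifications = sum(CASES_BY_MEASURE.values()) - IDENTIFIED_BY_ANY_MEASURE
print(f"repeat identifications: {repeat_identifications}")
```

Under these counts, the surplus is small relative to the 81,219 people identified overall, which is consistent with the statement that the measures mostly captured different individuals.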

INTERPRETATION: The prevalence of long COVID presenting in general practice was estimated to be 0.02-1.7%, depending on the measure used. Due to challenges in diagnosing long COVID and inconsistent recording of information in EHRs, the true prevalence of long COVID is likely to be higher. The operational definition provided a novel approach but relied on a restricted set of symptoms and may misclassify individuals with pre-existing health conditions. Further research is needed to refine and validate this approach.

FUNDING: Chief Scientist Office (Scotland), Medical Research Council, and BREATHE.

Original language: English
Article number: 102590
Number of pages: 13
Early online date: 11 Apr 2024
Publication status: Published - 1 May 2024

Keywords / Materials (for Non-textual outputs)

  • Long COVID
  • Population Surveillance
  • Primary Health Care
  • Clinical Coding
  • Matched Pair Analysis


  • 2023 Florence Nightingale Award for Excellence in Healthcare Data Analytics

    Jeffrey, Karen (Recipient), Daines, Luke (Recipient), Woolford, Lana (Recipient), Maini, Rishma (Recipient), Pandya, Anouska (Recipient), Sagar, Debbie (Recipient), Borland, Jane (Recipient), Basetti, Siddharth (Recipient), Batchelor, Ashleigh (Recipient), Weatherill, David (Recipient), White, Christopher (Recipient), Hammersley, Victoria (Recipient), Millington, Tristan (Recipient), Macdonald, Calum (Recipient), Kerr, Steven (Recipient), Shi, Ting (Recipient), Quint, Jennifer K (Recipient), Linning, Gabriella (Recipient), Murray, Josie (Recipient), Shankar-Hari, Manu (Recipient), Kerr, Robin (Recipient), Watson, Bruce (Recipient), Shah, Ahmar (Recipient), Hameed, Safraj Shahul (Recipient), Fagbamigbe, Adeniyi (Recipient), Kelly, Dave (Recipient), Simpson, Colin (Recipient), Vital Katikireddi, Srinivasa (Recipient), Robertson, Chris (Recipient), Ritchie, Lewis D (Recipient) & Sheikh, Aziz (Recipient), 6 Jul 2023

    Prize: Prize (including medals and awards)
