Validation of algorithms identifying diagnosed obstructive sleep apnoea and narcolepsy in coded primary care and linked hospital activity data in England

Helen Strongman; Sofia H Eriksson; Kwabena Asare; Michelle A Miller; Martina Sýkorová; Hema Mistry; Kristin Veighey; Charlotte Warren-Gash; Krishnan Bhaskaran (2025) Validation of algorithms identifying diagnosed obstructive sleep apnoea and narcolepsy in coded primary care and linked hospital activity data in England. Sleep Epidemiology, 5. p. 100110. ISSN 2667-3436 DOI: 10.1016/j.sleepe.2025.100110

Purpose: To assist sleep epidemiology research, we created and tested the accuracy of five algorithms identifying diagnosed Obstructive Sleep Apnoea (OSA) and narcolepsy in routinely collected data from England (01/01/1998–29/03/2021).

Methods: The primary algorithm identified the first coded record in Clinical Practice Research Datalink (CPRD) primary care or linked hospital admissions data as an incident diagnosis of OSA (n = 92,222) or narcolepsy (n = 1072). Alternative algorithms required a code in CPRD primary care data, codes in both datasets, or an additional proximate possible-sleep-related outpatient visit or excessive daytime sleepiness drug prescription (narcolepsy only). Staff in 73/1574 CPRD practices completed online questionnaires for a convenience sample of 144 OSA and 101 narcolepsy cases. We estimated Positive Predictive Values (PPVs) describing the proportion of cases confirmed by a gold standard hospital specialist diagnosis, the percentage of gold standard cases from the primary algorithm retained with alternative algorithms, and the time between specialist and recorded diagnosis dates.
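
As an illustration of the case definitions described in the Methods, the following is a minimal Python sketch and not the authors' code. It assumes each person's coded records have already been extracted as lists of dates per data source; the function names `first_coded_record` and `in_both_datasets` and the example dates are hypothetical.

```python
from datetime import date

def first_coded_record(primary_care, hospital):
    """Primary-algorithm sketch: the earliest coded record in either source
    is taken as the incident diagnosis. Returns (date, source) or None."""
    records = [(d, "primary_care") for d in primary_care]
    records += [(d, "hospital") for d in hospital]
    # Tuples sort by date first, so min() picks the earliest record.
    return min(records) if records else None

def in_both_datasets(primary_care, hospital):
    """Alternative-algorithm sketch: require at least one code in each source."""
    return bool(primary_care) and bool(hospital)

# Hypothetical person with one primary care code and one earlier hospital code.
gp_dates = [date(2015, 3, 1)]
hes_dates = [date(2014, 11, 20)]
incident = first_coded_record(gp_dates, hes_dates)
both = in_both_datasets(gp_dates, hes_dates)
```

The stricter definition keeps fewer people, which is the trade-off between PPV and retained cases that the study quantifies.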

Results: Using the primary algorithm, the PPV (95 % CI) was 75.3 % (69.2–81.3) for OSA and 65.2 % (57.0–73.4) for narcolepsy; 80.6 % and 62.7 % of confirmed cases, respectively, were recorded within 6 months of the specialist diagnosis. The CPRD-only algorithm increased the PPV to 85.3 % (77.3–91.4) for OSA and 71.0 % (58.8–81.3) for narcolepsy and retained high proportions of gold standard cases. Requiring additional outpatient or prescribing data increased PPVs and, for OSA, improved diagnostic date accuracy, but omitted a high proportion of gold standard cases.
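
For readers unfamiliar with the measure, a PPV and its confidence interval can be computed from two counts: the number of algorithm-identified cases reviewed and the number confirmed by the gold standard. The sketch below uses a simple normal-approximation (Wald) interval; the abstract does not state which interval method the authors used, so this choice is an assumption, and the counts in the example are hypothetical rather than taken from the study.

```python
import math

def ppv_with_ci(confirmed, reviewed, z=1.96):
    """Return (ppv, lower, upper) as proportions, using a Wald interval.

    confirmed: cases confirmed by the gold standard
    reviewed:  all algorithm-identified cases that were reviewed
    """
    if reviewed <= 0 or not 0 <= confirmed <= reviewed:
        raise ValueError("need 0 <= confirmed <= reviewed and reviewed > 0")
    ppv = confirmed / reviewed
    half_width = z * math.sqrt(ppv * (1 - ppv) / reviewed)
    # Clamp to the valid range for a proportion.
    return ppv, max(0.0, ppv - half_width), min(1.0, ppv + half_width)

# Hypothetical counts, for illustration only.
ppv, lower, upper = ppv_with_ci(confirmed=30, reviewed=40)
print(f"PPV {ppv:.1%} (95% CI {lower:.1%} to {upper:.1%})")
```

With small samples or proportions near 0 or 1, a Wilson or exact interval is usually preferred over the Wald interval shown here.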

Conclusion: Highly accurate OSA diagnoses can be identified in routinely collected data. Recorded cases of narcolepsy are moderately accurate, but diagnosis dates are not.


Strongman-etal-2025-Validation-of-algorithms.pdf
Published Version
Available under Creative Commons: Attribution 4.0