Defining cases of asthma, eczema and allergic rhinitis using electronic health records in the Born in Bradford birth cohort

Key Messages
• In electronic health records, the accuracy of diagnostic codes to define outcomes can be uncertain
• The accuracy can vary in different settings, doctors and practices, even with validated codes
• We recommend definitions combining codes previously described and other codes available in the records


To the Editor,

In studies based on electronic health record (EHR) databases, diagnostic codes are commonly used to define clinical outcomes. However, the accuracy of the codes depends on several factors, such as whether the medical diagnosis is correct and the opportunity for physical examination (ascertainment process), and validity can vary between datasets. 1,2 The diagnosis of asthma and allergic diseases (AAD) in young children is particularly challenging: the symptoms are intermittent and the differential diagnosis is difficult. 3 Therefore, most diagnoses rely on response to treatment and on parental report of symptoms, which can be influenced by past experiences of disease in the children and parents and can in turn lead to recall bias. The impact of disease misclassification can be important, depending on whether it is differential or non-differential and whether it is dependent on other errors. 4

We recently analysed the association between exposure to antibiotics and the risk of AAD (asthma, atopic eczema, and allergic rhinitis 5,6) in children participating in the Born in Bradford (BiB) birth cohort study. 7 Briefly, 12,453 pregnant women were recruited to BiB between 2007 and 2010, resulting in the births of over 13,500 children. Consent for health record linkage was obtained, and linkage has been achieved for approximately 98% of participants. In total, 13,044 children were linked to EHR. The protocol for the antibiotics study, written before the study started, can be found in reference 5.

In this letter, we present our approach to defining AAD outcomes using CTV3 Read codes (coded clinical terms designed for use in EHR in the NHS in the UK) and British National Formulary (BNF) codes for prescriptions of medications.
Initially, we planned to follow the common practice of using only validated definitions described in previous EHR studies, to ensure comparability. However, we identified several issues: diagnostic procedures are not standardised; the codes used and their frequency can vary across settings and doctors; and some cases may not be recorded with the validated codes. Conversely, including all Read codes in our EHR that relate to our outcomes could introduce bias, because some Read codes are used for non-cases (e.g., family history of asthma).
Using some of the methods recommended for developing clinical codelists, 8 we first conceived conceptual definitions for each disease based on available data. Then, we searched for diagnoses in our EHR database in two ways: (1) using diagnostic codes described in previous studies, and (2) using case-insensitive text mining of the term definitions that accompany Read codes and that could indicate a diagnosis of AAD. For asthma, we found a large number of terms, which required us to adopt a pragmatic approach to shortlisting. The authors SSC and LP selected all codes describing diagnoses, current adherence to treatment and control assessments, and excluded those that described asthma screening or were considered too vague. For atopic eczema and hay fever, all codes found were related to the diagnosis and did not require the steps we employed for asthma. Additionally, we searched for BNF codes for the most common medications used to treat AAD (including generic and brand names). We discussed our definitions and lists of Read/BNF codes with clinicians and other researchers with expertise in AAD and agreed on the final definitions.
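As an illustration only, the case-insensitive term search and a first-pass shortlisting could be sketched as follows. The code identifiers, term descriptions and exclusion keywords are hypothetical placeholders, not real CTV3 Read codes or the authors' codelist. In the study the shortlisting was done by manual review, so the automated exclusion pattern is an assumption standing in for that step.

```python
import re

# Hypothetical (code, term description) pairs; placeholders, not real Read codes.
TERMS = [
    ("CODE1", "Asthma"),
    ("CODE2", "Family history of asthma"),
    ("CODE3", "ASTHMA MONITORING"),
    ("CODE4", "Asthma screening"),
    ("CODE5", "Atopic dermatitis"),
]

# Case-insensitive search pattern for the outcome of interest.
INCLUDE = re.compile(r"asthma", re.IGNORECASE)
# Assumed keywords flagging non-case terms; a stand-in for manual review.
EXCLUDE = re.compile(r"family history|screening", re.IGNORECASE)


def search_terms(terms, include, exclude):
    """Return (candidates, shortlisted) lists of (code, term) pairs."""
    candidates = [(code, term) for code, term in terms if include.search(term)]
    shortlisted = [(code, term) for code, term in candidates
                   if not exclude.search(term)]
    return candidates, shortlisted


candidates, shortlisted = search_terms(TERMS, INCLUDE, EXCLUDE)
```

In practice the candidate list would still be reviewed by hand, since keyword exclusion alone cannot judge whether a term is too vague to indicate a diagnosis.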
To deal with uncertainty about whether the presence of a Read code for AAD reflected a confirmed diagnosis, we created two definitions for each outcome. For asthma and atopic eczema, the first definition was regarded as more specific than the second. The final case definitions are detailed in Table 1, and the CTV3 Read codes can be found at https://doi.org/10.17037/DATA.00003098.

For asthma, differential diagnosis can be challenging in children under 5 years of age. We therefore based our first definition on (1) those with selected Read codes at 6 years old irrespective of prescriptions, or (2) those with Read codes only between 3 and 5 years but with regular prescriptions for asthma at ≥6 years of age. The latter criterion captures repeated prescriptions at an age when the diagnosis is made with more certainty. The second definition required the presence of the selected Read codes at age 3 years or older, irrespective of a prescription being issued.

For atopic eczema, we selected CTV3 Read codes adapted from previous studies. 9 We excluded infants [...]

[...] definition, regarded as more specific, was restricted to a subset of children for whom a Read code was present between 1 March and 31 July and who were also prescribed medication. This period corresponds to the season of allergic rhinitis induced by pollen allergy (hay fever).
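The two asthma definitions and the hay fever season window described above can be expressed as simple rules. The sketch below is not the authors' implementation and rests on stated assumptions: "at 6 years old" is read as age 6 or older, "regular prescriptions" is represented by a hypothetical `min_rx` threshold (default 2), and all function and parameter names are invented for illustration.

```python
from datetime import date


def age_years(dob: date, event: date) -> int:
    """Completed years of age on the event date."""
    return event.year - dob.year - ((event.month, event.day) < (dob.month, dob.day))


def asthma_definitions(dob, code_dates, rx_dates, min_rx=2):
    """Return (definition_1, definition_2) flags for one child.

    code_dates: dates of shortlisted asthma Read codes.
    rx_dates:   dates of asthma prescriptions.
    min_rx:     assumed threshold for "regular" prescriptions (hypothetical).
    """
    code_ages = [age_years(dob, d) for d in code_dates]
    rx_from_6 = sum(age_years(dob, d) >= 6 for d in rx_dates)

    code_from_6 = any(a >= 6 for a in code_ages)      # assumption: age 6 or older
    code_3_to_5 = any(3 <= a <= 5 for a in code_ages)

    # Definition 1 (more specific): code from age 6, or code at 3-5 years
    # backed by regular prescriptions from age 6.
    definition_1 = code_from_6 or (code_3_to_5 and rx_from_6 >= min_rx)
    # Definition 2: any code at age 3 or older, regardless of prescriptions.
    definition_2 = any(a >= 3 for a in code_ages)
    return definition_1, definition_2


def in_pollen_season(d: date) -> bool:
    """True if the date falls between 1 March and 31 July inclusive."""
    return (3, 1) <= (d.month, d.day) <= (7, 31)
```

For example, a child with a single Read code at age 4 and no later prescriptions would meet the second definition but not the first.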
We compared the different prescriptions used in our case definitions against the Read codes as the reference standard, using positive and negative predictive values (PPV and NPV, respectively). 2 PPV represents the proportion of children who had a Read code among those who were prescribed medication; NPV is the proportion who did not have a Read code among those who were not prescribed medication. For asthma, where all medications are dispensed on prescription, the PPV was high for inhaled corticosteroids and leukotriene receptor antagonists. The NPV was high for all three medications, with bronchodilators and inhaled corticosteroids being prescribed for most cases (Table 2). This suggests that prescriptions as additional criteria can be of particular relevance for asthma. All [...]

In conclusion, we have presented case definitions for asthma, atopic eczema and allergic rhinitis using UK EHR data on GP-recorded diagnoses and prescriptions. We combined definitions adapted from previous studies with text mining of Read codes and amended our definitions in consultation with experts in this field. We recommend that researchers consider using this approach in similar UK studies to deal with uncertainties in the case definitions.
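The PPV and NPV comparison of prescriptions against Read codes, as defined earlier in this letter, amounts to a two-by-two tabulation per medication. The following is a minimal sketch with made-up data; the function name and inputs are hypothetical and the values do not reproduce any results from Table 2.

```python
def ppv_npv(prescribed, has_read_code):
    """PPV and NPV of a prescription flag, with the Read code as reference.

    prescribed, has_read_code: equal-length sequences of booleans, one per child.
    Returns (ppv, npv); a value is None when its denominator is zero.
    """
    pairs = list(zip(prescribed, has_read_code))
    prescribed_with_code = sum(p and r for p, r in pairs)
    prescribed_total = sum(p for p, _ in pairs)
    unprescribed_without_code = sum((not p) and (not r) for p, r in pairs)
    unprescribed_total = sum(not p for p, _ in pairs)

    ppv = prescribed_with_code / prescribed_total if prescribed_total else None
    npv = unprescribed_without_code / unprescribed_total if unprescribed_total else None
    return ppv, npv


# Made-up example: 3 children prescribed (2 with a Read code),
# 5 not prescribed (4 without a Read code).
prescribed = [True, True, True, False, False, False, False, False]
has_code = [True, True, False, True, False, False, False, False]
ppv, npv = ppv_npv(prescribed, has_code)
```

Here the PPV is 2/3 and the NPV is 4/5; applied per medication class, the same tabulation would yield one PPV and NPV pair for each.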

all the participants, health professionals and researchers who have made Born in Bradford happen.

CONFLICT OF INTEREST STATEMENT
None of the authors have conflicts of interest to declare.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly avail-