Linking Data for Mothers and Babies in De-Identified Electronic Health Data.

Harron, K; Gilbert, R; Cromwell, D; van der Meulen, J; (2016) Linking Data for Mothers and Babies in De-Identified Electronic Health Data. PLoS ONE, 11 (10). e0164667. ISSN 1932-6203 DOI:


Linkage of longitudinal administrative data for mothers and babies supports research and service evaluation in several populations around the world. We established a linked mother-baby cohort using pseudonymised, population-level data for England. This was a retrospective linkage study using electronic hospital records of mothers and babies admitted to NHS hospitals in England, captured in Hospital Episode Statistics between April 2001 and March 2013. Of 672,955 baby records in 2012/13, 280,470 (42%) linked deterministically to a maternal record using hospital, GP practice, maternal age, birthweight, gestation, birth order and sex. A further 380,164 (56%) records linked using probabilistic methods that incorporated additional variables that could differ between mother and baby records (admission dates, ethnicity, 3/4-character postcode district) or that could include missing values (delivery variables). The false-match rate was estimated at 0.15% using synthetic data. Data quality improved over time: for 2001/02, 91% of baby records were linked, compared with 98% for 2012/13 (holding the estimated false-match rate at 0.15%). The linked cohort was representative of national distributions of gender, gestation, birthweight and maternal age, and captured approximately 97% of births in England. Probabilistic linkage of maternal and baby healthcare characteristics offers an efficient way to enrich maternity data, improve data quality, and create longitudinal cohorts for research and service evaluation. This approach could be extended to linkage of other datasets that have non-disclosive characteristics in common.
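The two-stage approach summarised in the abstract (exact agreement on a fixed set of variables first, then a probabilistic score that tolerates disagreement and missing values) can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the m/u agreement probabilities, the threshold of 10.0 and the function names are all assumptions made for illustration, and the scoring uses a generic Fellegi-Sunter style sum of log-likelihood ratios, which the abstract does not specify.

```python
from math import log2

# Variables the abstract lists for deterministic linkage.
# The field names themselves are hypothetical.
DETERMINISTIC_KEYS = (
    "hospital", "gp_practice", "maternal_age",
    "birthweight", "gestation", "birth_order", "sex",
)

# Illustrative agreement probabilities (assumed, not taken from the paper):
# m = P(field agrees | records are a true match)
# u = P(field agrees | records are not a match)
M_U = {
    "hospital": (0.99, 0.01),
    "admission_date": (0.90, 0.003),
    "ethnicity": (0.85, 0.30),
    "postcode_district": (0.95, 0.001),
    "birthweight": (0.95, 0.002),
}


def deterministic_match(baby, mother):
    """True only if every key is present and identical in both records."""
    return all(
        baby.get(k) is not None and baby.get(k) == mother.get(k)
        for k in DETERMINISTIC_KEYS
    )


def match_weight(baby, mother):
    """Sum of log2 likelihood ratios; missing values contribute nothing."""
    total = 0.0
    for field, (m, u) in M_U.items():
        a, b = baby.get(field), mother.get(field)
        if a is None or b is None:
            continue  # a missing value is neither agreement nor disagreement
        total += log2(m / u) if a == b else log2((1 - m) / (1 - u))
    return total


def link(baby, mothers, threshold=10.0):
    """Return (mother_id, method); method is 'deterministic',
    'probabilistic' or 'unlinked'. The threshold is an assumed value."""
    exact = [m for m in mothers if deterministic_match(baby, m)]
    if len(exact) == 1:
        return exact[0]["id"], "deterministic"
    scored = [(match_weight(baby, m), m["id"]) for m in mothers]
    if scored:
        best_weight, best_id = max(scored)
        if best_weight >= threshold:
            return best_id, "probabilistic"
    return None, "unlinked"
```

In this sketch, a baby record with a missing gestation value fails the deterministic step but can still be linked through the weighted score, which mirrors the role the abstract gives to the probabilistic stage. In practice the threshold would be chosen to control the false-match rate, which the study estimated using synthetic data.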

Item Type: Article
Faculty and Department: Faculty of Public Health and Policy > Dept of Health Services Research and Policy
PubMed ID: 27764135
Web of Science ID: 386204500050

