Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data.

Welch, CA; Petersen, I; Bartlett, JW; White, IR; Marston, L; Morris, RW; Nazareth, I; Walters, K; Carpenter, J; (2014) Evaluation of two-fold fully conditional specification multiple imputation for longitudinal electronic health record data. Statistics in Medicine, 33 (21). pp. 3725-37. ISSN 0277-6715 DOI: https://doi.org/10.1002/sim.6184

Full text not available from this repository.


Most implementations of multiple imputation (MI) of missing data are designed for simple rectangular data structures and ignore the temporal ordering of data. Therefore, when applying MI to longitudinal data with intermittent patterns of missing data, alternative strategies must be considered. One approach is to divide the data into time blocks and implement MI independently at each block. An alternative approach is to include all time blocks in the same MI model. With increasing numbers of time blocks, this second approach is likely to break down because of collinearity and over-fitting. The new two-fold fully conditional specification (FCS) MI algorithm addresses these issues by conditioning only on measurements that are local in time. We describe and report the results of a novel simulation study to critically evaluate the two-fold FCS algorithm and its suitability for imputation of longitudinal electronic health records. After generating a full data set, approximately 70% of selected continuous and categorical variables were made missing completely at random in each of ten time blocks. Subsequently, we applied a simple time-to-event model. We compared the efficiency of estimated coefficients from a complete records analysis, MI of data in the baseline time block, and the two-fold FCS algorithm. The results show that the two-fold FCS algorithm maximises the use of available data, with the gain relative to baseline MI depending on the strength of correlations within and between variables. Using this approach also increases the plausibility of the missing at random assumption by using repeated measures over time of variables whose baseline values may be missing. © 2014 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd.
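
The idea of conditioning only on measurements that are local in time can be illustrated with a short Python sketch. This is not the authors' implementation: the function names (`mcar_mask`, `local_blocks`, `two_fold_sketch`) are hypothetical, only continuous variables are handled, the window is assumed to be one block either side, and each imputation is drawn from a normal linear model without drawing the regression parameters from their posterior, so the output is a single illustrative completed data set rather than proper MI.

```python
import numpy as np


def mcar_mask(shape, p_missing, rng):
    """Boolean mask, True where a value is set missing completely at random."""
    return rng.random(shape) < p_missing


def local_blocks(t, n_blocks, width=1):
    """Indices of the time blocks within `width` of block t, excluding t."""
    lo, hi = max(0, t - width), min(n_blocks - 1, t + width)
    return [s for s in range(lo, hi + 1) if s != t]


def two_fold_sketch(data, missing, n_among=3, n_within=2, width=1, rng=None):
    """Simplified two-fold sweep (assumption: continuous variables only).

    data:    array of shape (subjects, time blocks, variables)
    missing: boolean array of the same shape, True where a value is missing
    """
    rng = np.random.default_rng() if rng is None else rng
    x = data.copy()
    n, n_blocks, n_vars = x.shape

    # Initialise each missing entry with the observed mean of its block/variable.
    for t in range(n_blocks):
        for j in range(n_vars):
            mis = missing[:, t, j]
            x[mis, t, j] = x[~mis, t, j].mean()

    for _ in range(n_among):              # among-time iterations
        for t in range(n_blocks):
            nbrs = local_blocks(t, n_blocks, width)
            for _ in range(n_within):     # within-time iterations at block t
                for j in range(n_vars):
                    mis = missing[:, t, j]
                    obs = ~mis
                    others = [k for k in range(n_vars) if k != j]
                    # Predictors: other variables at block t, plus all
                    # variables in the neighbouring blocks only.
                    design = np.column_stack(
                        [np.ones(n), x[:, t, others], x[:, nbrs, :].reshape(n, -1)]
                    )
                    if not mis.any() or obs.sum() <= design.shape[1]:
                        continue
                    beta, *_ = np.linalg.lstsq(design[obs], x[obs, t, j], rcond=None)
                    resid = x[obs, t, j] - design[obs] @ beta
                    sigma = resid.std(ddof=design.shape[1])
                    # Draw imputations around the fitted values.
                    x[mis, t, j] = design[mis] @ beta + rng.normal(0.0, sigma, mis.sum())
    return x
```

A usage example under the same assumptions would build an array of shape (subjects, 10, variables), set `missing = mcar_mask(data.shape, 0.7, rng)` to mimic the 70% missing completely at random scenario, and pass both to `two_fold_sketch`. Because the design matrix for block t holds only blocks t-1 and t+1, the number of predictors stays fixed as the number of time blocks grows, which is how the approach avoids the collinearity and over-fitting of a model that includes all blocks at once.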

Item Type: Article
Faculty and Department: Faculty of Epidemiology and Population Health > Dept of Medical Statistics
Research Centre: Centre for Statistical Methodology
PubMed ID: 24782349
Web of Science ID: 340423700008
URI: http://researchonline.lshtm.ac.uk/id/eprint/1883891
