Methodological issues in electronic healthcare database studies of drug cancer associations: identification of cancer, and drivers of discrepant results

Rañopa, ME; (2016) Methodological issues in electronic healthcare database studies of drug cancer associations: identification of cancer, and drivers of discrepant results. PhD thesis, London School of Hygiene & Tropical Medicine. DOI:


Background: Epidemiological studies investigating the association between drug use and cancer risk have produced a number of conflicting findings. Methodological issues such as biased study designs and differences in case identification have been postulated as potential reasons for the differing results. However, the impact of these methodological variants is unclear.
Aims: The principal aims of this thesis were to develop and validate case definitions that identify incident cancer diagnoses in the Clinical Practice Research Datalink (CPRD), and to measure and compare the impact of several potential drivers of conflicting findings within a practical setting.
Methods: Firstly, for breast, colorectal, lung, and prostate cancer, two sets of incidence rates were estimated and compared to national estimates: (i) rates based on cancers identified in the CPRD alone; and (ii) rates from the CPRD incorporating linked cancer registry data. Secondly, the statin-cancer association was investigated as an exemplar, and several potential drivers of conflicting findings were examined, including study bias, case definitions, and data linkage. The study biases examined were immortal time, protopathic, prevalent user, healthy user, and time-window bias.
Results: Cancer incidence rates based on the CPRD alone were lower than national estimates across all cancer types. Incidence rates incorporating linked cancer registry data were similar to national estimates for colorectal and lung cancer, but higher for breast and prostate cancer. Of the seven potential drivers of discrepant results in the example study of statins and cancer, only time-window bias had a substantial and consistent effect, biasing estimates towards a protective association; corrected analyses yielded a null association. Immortal time, protopathic, prevalent user, and healthy user bias had minimal impact on the estimated association between statin use and cancer risk.
Conclusions: CPRD cancer incidence rates were lower than national estimates. When linked cancer registry data were incorporated, breast and prostate cancer incidence rates were higher than expected, implying that a proportion of the cancer cases identified in the CPRD were either false-positive cases or not registered nationally. A number of common design flaws and decisions were postulated as drivers of discrepant results. However, in practical study settings these flaws and differences did not uniformly lead to large changes in the estimated association between statin use and cancer risk.
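
The incidence rate comparison described in the Methods can be illustrated with a minimal sketch. It assumes crude (unstandardised) rates per 100,000 person-years and a simple rate ratio against the national estimate; the abstract does not say how rates were calculated or standardised. The function names and all counts below are hypothetical and are not taken from the thesis.

```python
def incidence_rate(cases: int, person_years: float, per: int = 100_000) -> float:
    """Crude incidence rate: number of cases per `per` person-years of follow-up."""
    if person_years <= 0:
        raise ValueError("person_years must be positive")
    return cases * per / person_years


def rate_ratio(rate: float, reference_rate: float) -> float:
    """Ratio of a study rate to a reference rate (for example, a national estimate)."""
    if reference_rate <= 0:
        raise ValueError("reference_rate must be positive")
    return rate / reference_rate


# Hypothetical counts for a single cancer type, for illustration only.
cprd_only = incidence_rate(cases=450, person_years=1_000_000)
cprd_linked = incidence_rate(cases=620, person_years=1_000_000)
national = incidence_rate(cases=600, person_years=1_000_000)

# A ratio below 1 means the database rate is lower than the national estimate,
# a ratio above 1 means it is higher.
ratio_cprd_only = rate_ratio(cprd_only, national)
ratio_cprd_linked = rate_ratio(cprd_linked, national)

print(f"CPRD only:   {cprd_only:.1f} per 100,000 person-years (ratio {ratio_cprd_only:.2f})")
print(f"CPRD linked: {cprd_linked:.1f} per 100,000 person-years (ratio {ratio_cprd_linked:.2f})")
```

With these made-up counts, the CPRD-only ratio falls below 1 and the linked ratio slightly above 1, mirroring the direction of the pattern reported for breast and prostate cancer. A real comparison would normally standardise by age and sex before taking ratios.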

Item Type: Thesis
Thesis Type: Doctoral
Thesis Name: PhD
Contributors: Bhaskaran, K (Thesis advisor); Douglas, I (Thesis advisor)
Faculty and Department: Faculty of Epidemiology and Population Health > Dept of Non-Communicable Disease Epidemiology
Funders: Clinical Practice Research Datalink
Copyright Holders: Michael Rañopa

