Deriving stage at diagnosis from multiple population-based sources: colorectal and lung cancer in England.

Benitez-Majano, S; Fowler, H; Maringe, C; Di Girolamo, C; Rachet, B; (2016) Deriving stage at diagnosis from multiple population-based sources: colorectal and lung cancer in England. British Journal of Cancer, 115 (3). pp. 391-400. ISSN 0007-0920 DOI:

Background: Stage at diagnosis is a strong predictor of cancer survival. Differences in stage distributions and stage-specific management help explain geographic differences in cancer outcomes. Stage information is thus essential to improve policies for cancer control. Despite recent progress, stage information is often incomplete, and data collection methods and definitions of stage categories are rarely reported. These inconsistencies may result in conflicting stages being assigned to a single tumour, and may confound the interpretation of international comparisons and temporal trends of stage-specific cancer outcomes. We propose an algorithm that uses multiple routine, population-based data sources to obtain the most complete and reliable stage information possible.

Methods: Our hierarchical approach derives a single stage category per tumour, prioritising the information deemed to be of best quality from multiple data sets and from the various individual components of tumour stage. It incorporates rules from the Union for International Cancer Control TNM classification of malignant tumours. The algorithm is illustrated for colorectal and lung cancer in England. We linked the cancer-specific Clinical Audit data (collected from clinical multi-disciplinary teams) to national cancer registry data. We prioritised stage variables from the Clinical Audit and added information from the registry when needed. We compared stage distribution and stage-specific net survival using two sets of definitions of summary stage, with contrasting levels of assumptions for dealing with missing individual TNM components. This exercise extends a previous algorithm we developed for international comparisons of stage-specific survival.

Results: Between 2008 and 2012, 163 915 primary colorectal cancer cases and 168 158 primary lung cancer cases were diagnosed in adults in England. Using the most restrictive definition of summary stage (valid information on all individual TNM components), colorectal cancer stage completeness was 56.6% (from 33.8% in 2008 to 85.2% in 2012). Lung cancer stage completeness was 76.6% (from 57.3% in 2008 to 91.4% in 2012). Stage distribution differed between strategies to define summary stage. Stage-specific survival was consistent with published reports.

Conclusions: We offer a robust strategy to harmonise the derivation of stage that can be adapted for other cancers and data sources in different countries. The general approach of prioritising good-quality information, reporting the sources of individual TNM variables, and reporting the assumptions made for dealing with missing data is applicable to any population-based cancer research using stage. Moreover, our research highlights the need for further transparency in the way stage categories are defined and reported, acknowledging the limitations and potential discrepancies of using readily available stage variables.
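The hierarchical derivation described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the record layout, the per-component fallback from audit to registry, and the single "permissive" rule for missing components are all assumptions made for illustration. The stage grouping is a simplified colorectal TNM grouping (I to IV, no sub-stages).

```python
from typing import Optional

# Assumed priority order: Clinical Audit first, registry as fallback.
SOURCE_PRIORITY = ("audit", "registry")


def pick_component(records: dict, name: str) -> Optional[int]:
    """Return the first non-missing value of a TNM component, by source priority."""
    for source in SOURCE_PRIORITY:
        value = records.get(source, {}).get(name)
        if value is not None:
            return value
    return None


def summary_stage(t: Optional[int], n: Optional[int], m: Optional[int],
                  restrictive: bool = True) -> Optional[str]:
    """Simplified colorectal grouping into stages I-IV.

    restrictive=True requires valid T, N and M (as in the abstract's most
    restrictive definition). restrictive=False is a hypothetical permissive
    variant that still assigns a stage when the known components suffice.
    """
    if restrictive and None in (t, n, m):
        return None
    if m == 1:
        return "IV"       # any T, any N, M1
    if m == 0 and n in (1, 2):
        return "III"      # any T, N1-N2, M0
    if m == 0 and n == 0 and t in (3, 4):
        return "II"       # T3-T4, N0, M0
    if m == 0 and n == 0 and t in (1, 2):
        return "I"        # T1-T2, N0, M0
    return None           # not enough information to assign a stage


def derive_stage(records: dict, restrictive: bool = True) -> Optional[str]:
    """Combine sources component by component, then group into a summary stage."""
    t, n, m = (pick_component(records, k) for k in ("t", "n", "m"))
    return summary_stage(t, n, m, restrictive)
```

For example, a tumour with T and M recorded in the audit and N available only from the registry would still receive a stage, while a tumour with only M1 recorded is staged under the permissive variant but left missing under the restrictive one. This is the kind of contrast that makes stage distributions differ between definitions.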

Item Type: Article
Faculty and Department: Faculty of Epidemiology and Population Health > Dept of Non-Communicable Disease Epidemiology
Research Centre: Cancer Survival Group
PubMed ID: 27328310
Web of Science ID: 380380400016

