Setting
The province of Manitoba, located in central Canada, has a population of 1.37 million (as of 2019) [11]. Approximately 55% of the population live in the capital city of Winnipeg. Manitoba Health, Seniors and Active Living (MHSAL), the publicly funded provincial health insurance agency, provides comprehensive universal health coverage for hospitalizations, procedures, and physician visits for provincial residents. MHSAL maintains several electronic databases to monitor health care use and reimburse health care providers for services delivered. Since 1984, provincial residents have been assigned a personal health identification number (PHIN) which can be used to link provincial health information databases allowing health care utilization and outcomes to be tracked longitudinally.
Data sources
The Manitoba Cancer Registry (MCR) was used to identify individuals diagnosed with breast or colorectal cancer, cancer diagnosis date, age at diagnosis, cancer stage, estrogen receptor/progesterone receptor (ER/PR) and human epidermal growth factor receptor 2 (HER-2) status, date and type of cancer surgery, and date of the first radiation treatment for each course of radiation treatment. The CancerCare Manitoba (CCMB) electronic medical record is the record of clinical cancer interactions, investigations, and treatment, and was used to determine dates and types of systemic anti-cancer medical therapy as well as carcinoembryonic antigen (CEA) and cancer antigen 15–3 (Ca 15–3) blood test results.
We used three MHSAL administrative databases: the Manitoba Population Registry, the Medical Claims database, and the Drug Program Information Network (DPIN) database. The Manitoba Population Registry contains demographic, vital status, and migration information and was used to determine the start and end dates of provincial health coverage. The Medical Claims database is generated from claims filed by health care providers for reimbursement of services and includes the service provided, diagnosis, provider, and service date. Medical Claims data were used to identify palliative care consultations. The DPIN database includes all prescriptions dispensed from outpatient pharmacies in Manitoba. DPIN data were used to identify prescriptions for capecitabine, a chemotherapy drug used to treat different cancers including breast and colorectal cancer. Laboratory data were obtained from Shared Health, Manitoba’s public sector laboratory, to identify CEA and Ca 15–3 blood test results that were not already in the CCMB medical record. The accuracy and completeness of Manitoba Health’s administrative data have been previously established [12,13,14].
Study population
The study included individuals diagnosed with stage I-III colorectal cancer (International Classification of Diseases for Oncology, 3rd edition (ICD-O-3) codes C18.0, C18.2–9, C19, C20, C26.0) or breast cancer (ICD-O-3 codes C50.0–6, C50.8–9). Stage IV cases, which have metastasis at diagnosis, were excluded as these individuals develop progression (i.e., worsening disease) rather than recurrence.
The study population was divided into a training cohort of individuals diagnosed from 2004 to 2007 and a validation cohort of individuals diagnosed from 2008 to 2012. Breast and colorectal cancers were analyzed separately. The breast cancer training cohort included cancers that were either ER negative, PR negative, or HER-2 positive because these cancers have a higher recurrence rate, which reduced the number of cases that needed to be reviewed [15]. The colorectal cancer training cohort focused on stage II and III cancers because they are expected to have higher rates of recurrence than stage I cancers [16]. The breast and colorectal cancer validation cohorts included individuals diagnosed with stage I-III cancers. However, the breast cancer validation cohort was oversampled for ER negative, PR negative, and HER-2 positive cases and the colorectal cancer validation cohort was oversampled for higher stages to ensure that enough recurrences were identified. The validation cohort included individuals diagnosed in later years to provide external validation, which is a more rigorous validation method than internal or apparent validation [17,18,19].
Study variables
Study variables are summarized in Table 1. Cancer recurrence included loco-regional (reappearance of cancer in the same region of the body or the lymph nodes) and distant (reappearance of cancer in another part of the body) recurrence. Surgery and radiation treatment data were linked by a tumour ID, which identifies the treatment associated with a specific tumour. Therefore, if an individual had more than one cancer diagnosis, the treatment data could be linked to the appropriate cancer. The remaining variables could not be linked to a specific tumour. To increase accuracy in classifying the remaining variables, conditions were added. Surgeries (mastectomy, lumpectomy, or axillary lymph node dissection for breast cancer and bypass or resection surgery for colorectal cancer) performed more than 12 months after diagnosis were included to capture local recurrence after the use of neoadjuvant treatments; new disease within 1 year is usually considered part of the primary diagnosis. Surgeries for a non-breast site, and liver or lung resections, beyond 6 months were included to capture treatment for metastases after the initial treatment. Chemotherapy received more than 12 months after diagnosis and radiation treatment (RT) received more than 12 months after diagnosis for breast cancer were considered to be due to recurrence unless the treatments occurred after a second primary cancer treated with surgery. Although chemotherapy treatment could not be linked to a specific tumour, the treatment site was identified, which increased the accuracy of the association. A palliative care consult was considered to be due to recurrence if it was provided by an oncologist; was linked to breast cancer (breast cancer cases only), colorectal cancer (colorectal cancer cases only), lung cancer, liver cancer, or undetermined metastases; and occurred more than 6 months after diagnosis, which excluded any treatment discussions that may have occurred shortly after diagnosis.
Elevated blood markers (CEA > 10; Ca 15–3 > 50), which are often related to recurrence, were considered to be due to recurrence when they occurred more than 12 months after diagnosis, unless they occurred within 3 months of another primary cancer diagnosis. Elevated blood markers within 12 months of diagnosis were not included, to avoid capturing initial elevations due to the original diagnosis or treatment.
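The blood marker condition described above can be illustrated with a minimal Python sketch. It is not the study's code: the function and field names are hypothetical, months are approximated with an average month length, and "within 3 months of another primary cancer diagnosis" is assumed to apply on either side of that diagnosis date. Only the thresholds (CEA > 10; Ca 15–3 > 50) and the 12-month and 3-month windows come from the text.

```python
from datetime import date

# Thresholds as stated in the text (units are not given there).
THRESHOLDS = {"CEA": 10.0, "CA15-3": 50.0}

AVERAGE_DAYS_PER_MONTH = 30.4375  # assumption: approximate month length


def months_between(start: date, end: date) -> float:
    """Approximate number of months from start to end (negative if end is earlier)."""
    return (end - start).days / AVERAGE_DAYS_PER_MONTH


def marker_indicates_recurrence(marker, value, test_date, diagnosis_date,
                                other_primary_dates=()):
    """Hypothetical helper: apply the elevated-marker condition to one test result."""
    # The result must exceed the marker-specific threshold.
    if value <= THRESHOLDS[marker]:
        return False
    # Elevations within the first 12 months after diagnosis are ignored.
    if months_between(diagnosis_date, test_date) <= 12:
        return False
    # Elevations close to another primary diagnosis are ignored
    # (assumed here to mean 3 months before or after that diagnosis).
    for other_date in other_primary_dates:
        if abs(months_between(other_date, test_date)) <= 3:
            return False
    return True
```

For example, a CEA result of 15 recorded about 16 months after diagnosis would be flagged, whereas the same result would not be flagged if another primary cancer had been diagnosed one month earlier.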
Table 1 Variables included in the study
Algorithm development and validation
A chart review was first conducted by trained research assistants to identify cancer recurrence. A duplicate chart review was conducted for a fraction of the cohort (10%) by a research assistant who did not conduct the initial chart review, to evaluate inter-rater reliability. The algorithms were then developed by analyzing the same cohorts using two approaches: pre-specified variables and conditional inference trees. The pre-specified variable approach used variables and clinically meaningful cut-offs determined prior to the start of the study. Variables and cut-offs were selected using information from previous studies and local cancer experts. For this algorithm, if an individual was positive for any of the variables included, they were predicted to have a recurrence. The conditional inference tree approach is an automated machine learning technique that explicitly states the algorithm that was developed, which many other machine learning techniques do not provide. The conditional inference trees used the same variables as the pre-specified algorithm. However, trees were created based on the association between each covariate and the outcome of interest (i.e., recurrence). The trees were fitted using the ctree function within the party R package with the default settings (a quadratic test statistic, Bonferroni-adjusted p-values, and criterion p-value of 0.05) [20]. Validation cohorts were used to determine whether the algorithms developed were generalizable to cancer cohorts independent of those analyzed as part of the algorithm development.
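The decision rule of the pre-specified variable approach (positive on any variable implies predicted recurrence) can be sketched in Python as follows. This is an illustration only: the indicator names are hypothetical stand-ins for the variables listed in Table 1, and each indicator is assumed to have already been derived using its own time-window condition.

```python
from typing import Mapping

# Hypothetical indicator names, loosely modelled on the conditions described
# in the text; the study's actual variables are those listed in its Table 1.
EXAMPLE_VARIABLES = (
    "surgery_after_12_months",
    "chemotherapy_after_12_months",
    "palliative_care_consult_after_6_months",
    "elevated_marker_after_12_months",
)


def predict_recurrence(indicators: Mapping[str, bool]) -> bool:
    """Predict recurrence if the individual is positive for any listed variable.

    Indicators missing from the mapping are treated as negative (an assumption).
    """
    return any(bool(indicators.get(name, False)) for name in EXAMPLE_VARIABLES)
```

Because the rule is a simple logical OR, adding a variable can only increase the number of predicted recurrences, which tends to favour sensitivity over specificity.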
Performance metrics
Sensitivity (the percentage of individuals who had a recurrence that were correctly identified), specificity (the percentage of individuals who did not have a recurrence that were correctly identified), positive predictive value (PPV) (the percentage of individuals predicted to have a recurrence that truly had a recurrence), negative predictive value (NPV) (the percentage of individuals predicted to not have a recurrence that truly did not have a recurrence), correct classification (the percentage of individuals who were correctly classified as having a recurrence or not having a recurrence), and scaled Brier scores were calculated to determine algorithm accuracy. These classification measures are commonly used in the literature to describe the performance of algorithms and models. The Brier score measures predictive accuracy as the average of the squared differences between predicted values and outcome values. The Brier score was then scaled to the proportion of events (p) in the cohort as 1 - (Brier score / (mean(p) * (1 - mean(p)))), where a value of 1 is perfect prediction, a value of 0 is chance, and a negative value is worse than chance [21, 22]. Therefore, unlike measures such as sensitivity, specificity, and correct classification, the scaled Brier score is adjusted for the prevalence of events in the cohort. This is an advantage over commonly used classification measures, which ignore prevalence. Measures were unweighted for both the training and validation cohorts. Weighted measures were also calculated for the validation cohort to account for oversampling. Confidence intervals were determined using the permutation method with 1000 replications, reporting the values at the 2.5th and 97.5th percentiles.
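The measures defined above can be computed as in the following Python sketch, which is illustrative and not the study's code. The function names are hypothetical. The optional weights argument is one plausible way to obtain weighted measures (for example, inverse sampling probabilities); the text does not state how the weights were constructed, so this is an assumption. The measures are returned as proportions rather than percentages.

```python
def classification_metrics(y_true, y_pred, weights=None):
    """Sensitivity, specificity, PPV, NPV, and correct classification.

    y_true and y_pred are sequences of 0/1 values. weights is optional;
    when omitted, every individual counts equally (unweighted measures).
    Raises ZeroDivisionError if a denominator is empty (e.g. no recurrences).
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    tp = fp = tn = fn = 0.0
    for truth, predicted, weight in zip(y_true, y_pred, weights):
        if truth and predicted:
            tp += weight
        elif predicted:
            fp += weight
        elif truth:
            fn += weight
        else:
            tn += weight
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "correct_classification": (tp + tn) / (tp + fp + tn + fn),
    }


def scaled_brier(y_true, y_prob):
    """Scaled Brier score: 1 - Brier / (p * (1 - p)), with p the event proportion."""
    n = len(y_true)
    brier = sum((prob - truth) ** 2 for truth, prob in zip(y_true, y_prob)) / n
    prevalence = sum(y_true) / n
    return 1 - brier / (prevalence * (1 - prevalence))
```

As a check on the scaling, predicting the cohort prevalence for every individual gives a Brier score of p(1 - p) and therefore a scaled score of 0, while perfect prediction gives a scaled score of 1.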