In March 2020, soon after the World Health Organization declared the coronavirus disease 2019 (COVID-19) outbreak a pandemic, US states and territories implemented policies and strategies limiting person-to-person interactions to reduce transmission in the community and to accommodate the surge of emergent healthcare needs. These measures led to a reduction in the number of cancer screenings, diagnoses, and treatments measured in healthcare encounters,1 Medicare claims,2 and electronic pathology reports.3

Although healthcare encounters, medical claims, and pathology reports can be used as proxies for the number of individuals screened, newly diagnosed, or treated for cancer, cancer data collected by certified tumor registrars following registry standards are the gold standard for accurately evaluating the impact of the pandemic on cancer care and outcomes.4

Importantly, the COVID-19 pandemic and associated community mitigation strategies might have impacted both the number of individuals diagnosed with cancer and the ability of cancer registrars to collect and report cancer data.

The aim of this study was to determine whether changes in the number of cancer cases reported to the National Cancer Database (NCDB) took place during 2020, the first year of the COVID-19 pandemic, and, if so, whether those changes were significant and whether they reflected alterations in cancer diagnoses or limitations in registrars’ ability to report data. Answering these questions is critical before incorporating 2020 data from the NCDB into research studies of people in the USA diagnosed with cancer.

Methods

The NCDB is a nationwide hospital-based cancer registry, which includes approximately 70% of all newly diagnosed cancers in the USA.5 NCDB data elements are abstracted by certified tumor registrars who undergo formal training through the National Cancer Registrars Association. Variable definitions are standardized with other cancer registries and participating sites undergo periodic data audits to ensure data reliability. Although the current data submission to the NCDB is real time (registrars continuously submit cases), registrars update submissions and add new information (e.g., treatment, date of last contact) to cases diagnosed in prior months. The last date of data input to the NCDB that was considered for the 2020 diagnosis year was 15 March 2022.

To evaluate changes in the number of cancer cases diagnosed in 2020, we first calculated the total number of cancer diagnoses reported to the NCDB between January 2018 and December 2020. We then used the additive outlier method to identify changes (structural breaks) in the natural logarithm of the number of cancer diagnoses reported to the NCDB each month between January 2018 and December 2020, accounting for seasonality, regional variation, and random error.6

Next, we quantified the change in the number of observed cancer cases diagnosed in 2020 compared with the number of cancer cases expected on the basis of historic patterns. We estimated the expected number of cancer cases diagnosed between January 2020 and December 2020 using monthly counts of cancer diagnoses from January 2018 to December 2019 and the vector autoregressive method to account for seasonality and random error.7
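To make the notion of an "expected" count concrete, the following sketch projects a year of monthly counts from the two preceding years. It deliberately swaps in a seasonal-naive forecast scaled by year-over-year growth, which is far simpler than the vector autoregressive method described above. The function name and all counts are hypothetical.

```python
def expected_from_history(prior_year, latest_year):
    """Project next year's monthly counts from two years of history.

    Each calendar month of the latest year is carried forward and scaled by
    the overall growth between the two historical years.
    """
    if len(prior_year) != 12 or len(latest_year) != 12:
        raise ValueError("each year must contain 12 monthly counts")
    growth = sum(latest_year) / sum(prior_year)
    return [count * growth for count in latest_year]


# Hypothetical monthly counts for two historical years (not NCDB data).
year_a = [100.0] * 12
year_b = [110.0] * 12
expected = expected_from_history(year_a, year_b)
```

Carrying each calendar month forward preserves the seasonal shape of the most recent year, while the growth factor is a crude substitute for the trend that a fitted time series model would estimate.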

We then estimated the difference between the expected and observed number of cases diagnosed in 2020 using generalized estimating equations under the assumption of a Poisson distribution for count data. This analysis was conducted for all cancers combined (over 50 cancer sites, including one category for “other” cancer sites) and separately for each of 21 selected cancer sites: prostate, lung, breast, colorectum, bladder, non-Hodgkin lymphoma, melanoma, kidney, uterus, pancreas, oral cavity and pharynx, thyroid, stomach, brain, ovary, liver, lymphocytic leukemia, myeloma, myeloid and monocytic leukemia, esophagus, and cervix.
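As a rough illustration of comparing observed with expected counts under a Poisson assumption, the sketch below computes the observed-to-expected ratio with a Wald interval on the log scale. It is not a generalized estimating equation: it treats the expected count as fixed and ignores correlation between months, and the function name and counts are hypothetical.

```python
import math


def observed_to_expected(observed, expected, z=1.96):
    """Return (ratio, lower, upper) for observed/expected.

    Assumes the observed count is Poisson and the expected count is a fixed
    constant, so the standard error of log(observed) is 1/sqrt(observed).
    """
    ratio = observed / expected
    se_log = 1.0 / math.sqrt(observed)
    lower = ratio * math.exp(-z * se_log)
    upper = ratio * math.exp(z * se_log)
    return ratio, lower, upper


# Hypothetical counts for a single cancer site (not NCDB data).
ratio, lower, upper = observed_to_expected(observed=8000, expected=10000)
percent_deficit = 100.0 * (1.0 - ratio)
```

An interval whose upper bound lies below 1 indicates fewer cases than expected under these simplifying assumptions. A model that accounts for within-series correlation would generally give wider intervals.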

To evaluate the impact of the pandemic on the ability of registrars to collect cancer data in 2020, we first calculated the number of records with date of diagnosis (month and year) between September 2019 and December 2020. Then, we calculated the number of records with date initiated [i.e., date (month and year) the electronic abstract was created in the database] between September 2019 and December 2020. Finally, we calculated the number of records with date completed [i.e., date (month and year) when specified data elements are completed and pass relevant data quality checks] between September 2019 and December 2020 (Supplementary Fig. 1). This analysis included all records reported to the NCDB, including those for the same cancer diagnosis processed at different times and/or different facilities (duplicates) in an effort to evaluate registrars’ workload.
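The three monthly tallies described above can be sketched as follows, assuming each record carries a (year, month) pair for each date type. The field names and the record layout are hypothetical and do not reflect the actual NCDB data dictionary.

```python
from collections import Counter

WINDOW_START = (2019, 9)   # September 2019
WINDOW_END = (2020, 12)    # December 2020


def monthly_counts(records, field):
    """Count records per (year, month) of the given date field in the window."""
    counts = Counter()
    for rec in records:
        year_month = rec.get(field)
        if year_month is not None and WINDOW_START <= year_month <= WINDOW_END:
            counts[year_month] += 1
    return counts


# Hypothetical records; duplicates of the same diagnosis are kept on purpose,
# because the aim is to measure workload rather than unique cases.
records = [
    {"diagnosed": (2020, 3), "initiated": (2020, 4), "completed": (2020, 9)},
    {"diagnosed": (2020, 3), "initiated": (2020, 4), "completed": None},
    {"diagnosed": (2019, 8), "initiated": (2019, 10), "completed": (2020, 4)},
]
by_type = {
    field: monthly_counts(records, field)
    for field in ("diagnosed", "initiated", "completed")
}
```

Counting each date type independently means a single record can contribute to three different months, which is what allows diagnosis volume to be separated from registrar activity.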

We hypothesized that if the pandemic impacted the ability of registrars to process cancer data, there would be a decline in the number of all three types of records (diagnosed, initiated, and completed) at the start of the pandemic. To test this hypothesis, we used interrupted time series analysis to identify immediate changes in the number of records processed each month after the COVID-19 pandemic was declared (March–June 2020) and tested for differences between the natural logarithm of the number of records by type (i.e., initiated and completed) using linear generalized estimating equation models, with the natural logarithm of the number of records diagnosed as the reference group.8 All analyses were performed using SAS 9.4. Statistical significance was set at two-sided α = 0.05.
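The "immediate change" that an interrupted time series looks for can be illustrated with a minimal sketch: fit a linear trend to the log counts before the interruption, project it one month ahead, and compare the projection with what was observed. This is a simplification of the models used here (no seasonality, no comparison across record types, no standard errors), and the function name and counts are hypothetical.

```python
import math


def immediate_level_change(pre_counts, post_count):
    """Estimate the log-scale level change at the first post-interruption month.

    A straight line is fitted to the log of the pre-interruption counts by
    ordinary least squares and extrapolated one step ahead.
    """
    n = len(pre_counts)
    times = list(range(n))
    logs = [math.log(c) for c in pre_counts]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    slope = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs)) / sum(
        (t - t_mean) ** 2 for t in times
    )
    intercept = y_mean - slope * t_mean
    predicted = intercept + slope * n  # projection for the interruption month
    return math.log(post_count) - predicted


# Hypothetical record counts: six flat months, then a drop (not NCDB data).
change = immediate_level_change([1000.0] * 6, 700.0)
percent_change = 100.0 * (math.exp(change) - 1.0)
```

Because the change is estimated on the log scale, exponentiating it gives a proportional change, which makes record types with very different volumes directly comparable.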

Results

Deviations from Trends in Monthly Number of Cancer Cases

In April 2020, there was a statistically significant structural break (decrease) in the number of cases diagnosed overall and for each cancer site compared with what was expected given previous years’ trend and seasonality (Table 1). This was the only statistically significant structural break from previous trends detected in 2020. In the months that followed, there was no evidence of recovery of absent cases as the subsequent number of cases diagnosed each month through December 2020 did not exceed projections (Fig. 1).

Table 1 Identified outliers (structural breaks) from number of cases expected to be diagnosed each month in 2020, NCDB

Fig. 1 Decrease in expected number of cancer cases diagnosed in 2020

Deficit in Number of Cases Diagnosed in 2020

The absolute decrease in the number of cancer cases reported to the NCDB in 2020 was 174,293, corresponding to a 12.4% deficit compared with the expected number of cases (Table 2). Breast, lung, and prostate cancers had the largest absolute decreases in the number of cases (34,411; 24,246; and 28,349 fewer cases than expected in 2020, respectively). The greatest percentage deficits in cancer diagnoses reported to the NCDB were observed for thyroid, melanoma, and prostate cancers (19.9%, 20.6%, and 19.5% lower than expected diagnoses in 2020, respectively).
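For orientation, the relationship between the absolute decrease and the percentage deficit can be written out as below, assuming the deficit is expressed relative to the expected count E, with O the observed count. The implied values of E and O are back-of-the-envelope figures derived from the two reported numbers; they are not reported in the text, and the rounding of 12.4% limits their precision.

```latex
\[
\text{deficit}(\%) = 100 \times \frac{E - O}{E},
\qquad
E - O = 174{,}293,\quad \text{deficit} = 12.4\%
\;\Longrightarrow\;
E \approx \frac{174{,}293}{0.124} \approx 1.41 \times 10^{6},
\quad
O = E - 174{,}293 \approx 1.23 \times 10^{6}.
\]
```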

Table 2 Absolute decrease in number of cancer diagnoses and deficit in cancer cases reported to the NCDB compared with expected in 2020

Assessment of Changes in Cancer Registrar Reporting

The deficit in cancer diagnoses reported to the NCDB in 2020 was not due to registrars’ inability to process records during the pandemic (Fig. 2). There was a statistically significant decline in the number of records with diagnosis date between March and April 2020. Similarly, there was a statistically significant decline in the number of records initiated by registrars (new cancer cases) in April 2020, with no difference between number of reported diagnoses and number of records initiated during these months (Table 3). In contrast, the number of records marked completed by registrars increased significantly between March and April 2020, demonstrating continuous ability to process data (Table 3). There were no statistically significant changes in number of cases processed with diagnosis, initiation, or completion date after May 2020.

Fig. 2 Number of records diagnosed, initiated, and completed by cancer registrars during the COVID-19 pandemic, NCDB 2019–2020

Table 3 Comparison between number of cancer cases diagnosed and number of cancer records processed by cancer registrars in NCDB-participating facilities during the initial months of the COVID-19 pandemic

Discussion

There was a significant decrease in the number of cancer cases reported to the NCDB in 2020, and this deficit was not due to cancer registrars’ inability to process cancer data during the first year of the COVID-19 pandemic. Therefore, NCDB data can be reliably used to evaluate the impact of the pandemic on cancer care and outcomes.

Patterns of decline were similar to those reported elsewhere, with the greatest deficit observed in April 2020.1,3,9 The number of cancer diagnoses did not exceed the expected number of cancer cases in any month through the end of 2020. This indicates that the deficit in new diagnoses in April 2020 was not made up by increased diagnoses in the months that followed; had it been, higher-than-expected numbers would have been seen in May–December 2020.

The deficit in the number of cancer cases reported to the NCDB in 2020 was significant for all cancer sites combined and for each of the 21 selected cancer sites. The most commonly diagnosed cancer sites in the US had the greatest absolute decreases in the number of cases reported to the NCDB in 2020.10 The greatest percentage deficits were observed for cancer sites often diagnosed and/or treated in outpatient settings, such as prostate cancer and melanoma, which have previously been reported to have the lowest case coverage in the NCDB,5 and for sites for which clinical guidelines have recently changed in an attempt to mitigate overdiagnosis (prostate and thyroid).10,11,12

The significant decrease in number of cancer diagnoses reported to the NCDB in 2020 was not due to registrars’ inability to process records during the pandemic. As expected, both the number of cancer records with diagnosis dates and the number of new cancer records created by registrars (initiation dates) in March and April of 2020 decreased. In contrast, the number of records with completion dates in March and April of 2020 increased, reflecting the ability of registrars to process cancer data during the first two months of the pandemic.

This study has limitations. Although the NCDB captures approximately 70% of individuals newly diagnosed with cancer in the US and cancer cases captured by the NCDB closely resemble those of population-based cancer registries,5 the NCDB is not population-based. Additionally, it is possible that Commission on Cancer (CoC)-accredited facilities reporting to the NCDB have greater access to resources and were better able to abstract and report cancer cases during the pandemic than non-accredited facilities. Future studies should evaluate the impact of the pandemic on the reliability of data collected by other national cancer registries.

Taken together, our results demonstrate that the decline in the number of cancer records reported to the NCDB in 2020 accurately reflects the impact of the pandemic on cancer care and was not biased by changes in data collection, as cancer registrars in CoC-accredited institutions were able to process cancer data during the pandemic. Future studies should evaluate how the pandemic, its associated community mitigation measures, and regional variation in care impacted cancer diagnosis, care, and outcomes.