Introduction

This study assesses the feasibility of using hospital discharge data to identify women with incident breast cancer and evaluates the use of these data to assess compliance with a quality-of-care indicator. The study has two goals: 1) to compare breast cancer case-finding algorithms based on ICD-9-CM codes in hospital discharge data with cancer registry data, and 2) to compare the utility of these algorithms for use with a selected quality indicator.

Methods

Women aged 20 years or older who resided continuously in the Emilia-Romagna Region, Italy, between 2002 and 2005 (N=1,887,212) were included in the analysis. The regional breast cancer tumor registry served as the “gold standard” for identifying incident cases. Algorithms using ICD-9-CM codes in the Italian hospital discharge data were developed to identify women with incident breast cancer. The algorithms required a diagnosis code for cancer as well as a surgical code for lumpectomy or mastectomy.
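The case-finding rule described above (a cancer diagnosis code plus a lumpectomy or mastectomy procedure code) can be sketched as follows. This is a minimal illustration, not the study's actual algorithm: the code prefixes, the `Discharge` record layout, and the choice to require both codes on the same discharge record are all assumptions, and the sketch does not attempt to exclude women with a prior (prevalent) cancer.

```python
from dataclasses import dataclass
from typing import Iterable, Set, Tuple

# Illustrative ICD-9-CM prefixes with the decimal point removed (assumptions;
# the study's exact code lists are not given in this abstract).
DIAGNOSIS_PREFIXES = ("174", "2330")  # malignant neoplasm of female breast; carcinoma in situ of breast
SURGERY_PREFIXES = ("852", "854")     # excision of breast tissue (incl. lumpectomy); mastectomy


@dataclass
class Discharge:
    """Hypothetical layout of one hospital discharge abstract."""
    patient_id: str
    diagnosis_codes: Tuple[str, ...]
    procedure_codes: Tuple[str, ...]


def normalize(code: str) -> str:
    """Strip whitespace and the decimal point so '174.9' becomes '1749'."""
    return code.strip().replace(".", "")


def has_prefix(codes: Iterable[str], prefixes: Tuple[str, ...]) -> bool:
    """True if any code starts with one of the given prefixes."""
    return any(normalize(c).startswith(prefixes) for c in codes)


def is_candidate(record: Discharge) -> bool:
    """Require both a breast cancer diagnosis and a breast surgery code."""
    return (has_prefix(record.diagnosis_codes, DIAGNOSIS_PREFIXES)
            and has_prefix(record.procedure_codes, SURGERY_PREFIXES))


def find_cases(records: Iterable[Discharge]) -> Set[str]:
    """Return the IDs of patients with at least one qualifying discharge."""
    return {r.patient_id for r in records if is_candidate(r)}


records = [
    Discharge("a", ("174.9",), ("85.21",)),  # diagnosis + lumpectomy: flagged
    Discharge("b", ("174.9",), ()),          # diagnosis only: not flagged
    Discharge("c", ("250.00",), ("85.41",)), # surgery without cancer diagnosis: not flagged
]
print(find_cases(records))
```

Requiring a surgical code trades sensitivity for specificity: women who have a diagnosis code but no surgery, like patient "b" here, are not flagged.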

These hospital discharge-based algorithms were then compared to the registry data. Sensitivity, specificity, and positive predictive value were computed overall and by age and cancer stage. Both case-finding approaches (registry and hospital discharge algorithm) were also used to evaluate the use of radiation therapy within one year after diagnosis for eligible women.

Results

The regional cancer registry identified 15,469 women as incident cases, while the hospital discharge data algorithm identified 15,218 women. Overall sensitivity was 87.5%, specificity was 99.9%, and positive predictive value (PPV) was 89.0%. Of the 1,677 false positives, 1,060 (64%) were women who had an incident case prior to 2000. There were 1,928 false negatives. Many of these women were older and did not undergo a surgical procedure, so they were not captured by our hospital data-based case-finding algorithm.
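As a check on the arithmetic, the three accuracy measures can be recomputed from the counts reported above. The sketch below assumes that the full study population (N=1,887,212) is the denominator and that every woman not in the registry is a true non-case; the function and argument names are illustrative.

```python
def case_finding_metrics(population, registry_cases, false_positives, false_negatives):
    """Compute accuracy measures with the registry as the gold standard."""
    true_positives = registry_cases - false_negatives
    # Non-cases are everyone outside the registry; remove those wrongly flagged.
    true_negatives = population - registry_cases - false_positives
    return {
        "sensitivity": true_positives / (true_positives + false_negatives),
        "specificity": true_negatives / (true_negatives + false_positives),
        "ppv": true_positives / (true_positives + false_positives),
    }


metrics = case_finding_metrics(
    population=1_887_212,
    registry_cases=15_469,
    false_positives=1_677,
    false_negatives=1_928,
)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
# sensitivity: 87.5%
# specificity: 99.9%
# ppv: 89.0%
```

Under these assumptions the counts imply 13,541 true positives, which is consistent with both the registry total (15,469 minus 1,928) and the algorithm total (15,218 minus 1,677).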

Sensitivity declined with increasing patient age; however, we observed relatively good PPV for all age groups. For the radiation therapy quality indicator, the cancer registry identified 5,847 women as the population at risk (denominator), while the hospital data-based algorithms identified 7,262 women. The estimated overall compliance with the radiation therapy indicator was 90.2% using the registry, compared to 78.8% using the hospital-based case-finding algorithm. The lower rate for the hospital-based population was chiefly due to a larger identified population at risk, which included false positives.

Conclusions

This project confirms the feasibility of using routinely available hospital discharge abstract data to identify women with incident breast cancer with good sensitivity, specificity, and positive predictive value when compared to cancer registry data. These hospital discharge-based algorithms allow analyses of large populations for which registry data may not be routinely available, making it possible to study subsequent patterns and outcomes of care.