Abstract
Population-based cancer registry data from the Surveillance, Epidemiology, and End Results (SEER) Program at the National Cancer Institute are based on medical records and administrative information. Although SEER data have been used extensively in health disparities research, the quality of information concerning race, Hispanic ethnicity, and immigrant status has not been systematically evaluated. The quality of this information was determined by comparing SEER data with self-reported data among 13,538 cancer patients diagnosed between 1973 and 2001 in the SEER-National Longitudinal Mortality Study linked database. The overall agreement was excellent on race (κ = 0.90, 95% CI = 0.88–0.91), moderate to substantial on Hispanic ethnicity (κ = 0.61, 95% CI = 0.58–0.64), and low on immigrant status (κ = 0.21, 95% CI = 0.10–0.23). The effect of these disagreements was that SEER data tended to under-classify patient numbers when compared with self-identification, except for the non-Hispanic group, which was slightly over-classified. These disagreements translated into varying racial-, ethnic-, and immigrant status-specific cancer statistics, depending on whether self-reported or SEER data were used. In particular, the 5-year Kaplan–Meier survival and the median survival time from all causes for American Indians/Alaska Natives were substantially higher when based on self-classification (59% and 140 months, respectively) than when based on SEER classification (44% and 53 months, respectively), although the number of patients was small. These results can serve as a useful guide to researchers contemplating the use of population-based registry data to ascertain disparities in cancer burden. In particular, the study results caution against evaluating health disparities by using registry birthplace as a measure of immigrant status or by relying on registry race information for American Indians/Alaska Natives.
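For readers unfamiliar with the κ statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed from a cross-tabulation of two classification sources (for example, registry versus self-report). The function name `cohen_kappa` and the 2x2 counts are hypothetical and chosen only for illustration; they are not taken from the study, and the sketch does not reproduce the authors' confidence interval calculation.

```python
def cohen_kappa(table):
    """Cohen's kappa for a square agreement table.

    Rows index the category assigned by source A (e.g. registry),
    columns the category assigned by source B (e.g. self-report).
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Observed agreement: share of subjects on the diagonal.
    observed = sum(table[i][i] for i in range(k)) / n
    # Chance agreement: product of the marginal proportions, summed over categories.
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(k)) for j in range(k)]
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts (NOT study data): 200 patients, two categories.
table = [[40, 10],
         [20, 130]]
kappa = cohen_kappa(table)
print(round(kappa, 3))  # 0.625
```

In this illustrative table the two sources agree on 85% of patients, but 60% agreement would be expected by chance from the marginal totals alone, so κ is (0.85 − 0.60) / (1 − 0.60) = 0.625. This is why a κ well below the raw percent agreement, as reported for immigrant status, can still arise when one category dominates.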
Acknowledgment
The authors would like to thank the editor and the reviewer for their valuable comments and suggestions, which led to a significant improvement of the original manuscript.
Cite this article
Clegg, L.X., Reichman, M.E., Hankey, B.F. et al. Quality of race, Hispanic ethnicity, and immigrant status in population-based cancer registry data: implications for health disparity studies. Cancer Causes Control 18, 177–187 (2007). https://doi.org/10.1007/s10552-006-0089-4
DOI: https://doi.org/10.1007/s10552-006-0089-4