Diffuse large B-cell lymphoma (DLBCL) is the most common type of non-Hodgkin lymphoma (NHL), with an estimated 20,300 new cases in the United States in 2010 [1-3]. It is classified as an aggressive form of NHL [3] because survival is limited in the absence of effective treatment [4]. The treatment of DLBCL has changed substantially in the past two decades. Prior to the introduction of the monoclonal antibody rituximab (Genentech, South San Francisco, CA), the mainstay of treatment for DLBCL was CHOP (cyclophosphamide, doxorubicin, vincristine, and prednisone), with a three-year overall survival of approximately 54% [5]. In 2002, a landmark study by Coiffier et al. demonstrated that rituximab plus CHOP (R-CHOP) significantly improved overall survival compared to CHOP alone in elderly patients with DLBCL [6]. Based on this study, and later studies that confirmed the findings [7-10], R-CHOP is now recommended as frontline therapy for most patients with advanced (Ann Arbor Stage III-IV) disease and many with localized (Ann Arbor Stage I-II) disease [3].

The existence of racial and ethnic disparities in health care access and outcomes is well-documented. An Institute of Medicine report [11] found that even when access-related factors such as insurance status and the ability to pay for care are the same, African Americans and Hispanics tend to receive a lower quality of health care across a range of disease areas and clinical services, including cancer. According to the American Cancer Society, African Americans are more likely to develop and die from cancer than any other racial or ethnic group [1]. Not surprisingly, therefore, considerable attention has been focused on better understanding racial and ethnic disparities in cancer care and outcomes, including NHL [12-14].

One recent study by Wang and colleagues examined ethnic variations in treatment and survival in patients diagnosed with NHL, including DLBCL and two indolent types of NHL, chronic lymphocytic leukemia (CLL) and follicular lymphoma (FL) [14]. This was a retrospective cohort study of patients diagnosed with NHL from 1992 to 1999, using the SEER-Medicare linked database. Based on multivariate analysis, the investigators found that among all patients diagnosed with NHL, blacks were significantly less likely than whites to receive chemotherapy (odds ratio [OR], 0.69; 95% confidence interval [CI], 0.50-0.95). The study found no difference in all-cause mortality between blacks and whites, either overall or in any of the specific types of NHL, including DLBCL.

Several changes in the treatment of NHL have occurred since the patients included in this study were diagnosed, including the introduction of rituximab, updated clinical practice guidelines, and ongoing pressures on the health system to reduce costs while improving outcomes. These may have impacted racial and ethnic disparities in access and outcomes in NHL, and in particular aggressive types such as DLBCL. For instance, the impact of racial and ethnic disparities in access to treatment for DLBCL, as documented by Wang and colleagues [14], could increase disparities in outcomes such as survival in the presence of more effective treatments such as R-CHOP. Therefore, the purpose of the present study was to identify factors associated with treatment and survival in a cohort of patients diagnosed with DLBCL during the period when rituximab was introduced, with a particular focus on race/ethnicity.


Data Source

The source of data for this study was the National Cancer Institute's (NCI) SEER cancer registry linked to Medicare enrollment and claims data. This database has been described in detail elsewhere [15]. Briefly, as of 2010, SEER collects and publishes cancer incidence and survival data from 18 population-based cancer registries throughout the United States covering approximately 26% of the US population [16]. SEER coverage includes 23% of African Americans, 40% of Hispanics, 42% of American Indians and Alaska Natives, 53% of Asians, and 70% of Hawaiian/Pacific Islanders.

The registries routinely collect data on patient demographics, primary tumor site, tumor morphology and stage at diagnosis, first course of treatment, and follow-up for vital status. In the SEER-Medicare data, cancer registry data are linked to Medicare enrollment and claims data. For persons age 65 years or older, 97% are eligible for Medicare, and 93% of patients in the SEER files are matched to the Medicare enrollment file [17]. Almost all Medicare beneficiaries have Part A coverage, which includes hospital, skilled nursing facility, hospice, and some home health care, and 96% of Part A beneficiaries choose to enroll in Part B of Medicare, which covers physician and outpatient services. At the time this study was performed, the SEER-Medicare linkage included all Medicare-eligible persons appearing in the SEER data through 2005 and their Medicare claims through 2007.

Patient Eligibility

Patients were included in this study if they were diagnosed with DLBCL between January 1, 2001 and December 31, 2005, and DLBCL was the first primary cancer diagnosed. Identification of DLBCL was made using two World Health Organization (WHO) International Classification of Diseases for Oncology, 3rd Edition (ICD-O-3) histology codes: 9680 (malignant lymphoma, large B-cell, diffuse, centroblastic, NOS) and 9684 (malignant lymphoma, large B-cell, diffuse, immunoblastic, NOS) [18, 19]. Patients were excluded for the following reasons: DLBCL diagnosed before age 65; diagnosis made by death certificate or autopsy; death within the first month following diagnosis; or Medicare enrollment less than 12 months before diagnosis. In addition, to ensure complete claims history, patients had to have been enrolled in both Medicare Parts A and B, with no health maintenance organization (HMO) coverage for 12 months prior to diagnosis. SEER reports date of diagnosis as month and year only. In this study, the first day of the SEER month of diagnosis was assigned as the day of diagnosis. After diagnosis, patients were followed until death, enrollment in an HMO, development of a second primary tumor, or the last date for which Medicare claims were available.
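The eligibility rules above can be sketched as a simple filter. This is a minimal illustration only, not the study's actual code: the `Patient` record and its field names are hypothetical and do not correspond to SEER-Medicare variable names, and the exclusion flags are assumed to have been derived beforehand.

```python
from dataclasses import dataclass
from datetime import date

# ICD-O-3 histology codes used in the text to identify DLBCL.
DLBCL_HISTOLOGY_CODES = {"9680", "9684"}
STUDY_START = date(2001, 1, 1)
STUDY_END = date(2005, 12, 31)


@dataclass
class Patient:
    # All field names are hypothetical, chosen for readability.
    histology_code: str
    diagnosis_year: int
    diagnosis_month: int
    age_at_diagnosis: int
    is_first_primary: bool
    diagnosed_by_death_certificate_or_autopsy: bool
    died_within_first_month: bool
    months_parts_a_b_without_hmo_before_dx: int


def assigned_diagnosis_date(p: Patient) -> date:
    """SEER reports month and year only; use the first day of that month."""
    return date(p.diagnosis_year, p.diagnosis_month, 1)


def is_eligible(p: Patient) -> bool:
    """Apply the inclusion and exclusion criteria described in the text."""
    dx = assigned_diagnosis_date(p)
    return (
        p.histology_code in DLBCL_HISTOLOGY_CODES
        and STUDY_START <= dx <= STUDY_END
        and p.is_first_primary
        and p.age_at_diagnosis >= 65
        and not p.diagnosed_by_death_certificate_or_autopsy
        and not p.died_within_first_month
        and p.months_parts_a_b_without_hmo_before_dx >= 12
    )
```

For example, `is_eligible(Patient("9680", 2003, 6, 72, True, False, False, 24))` evaluates the criteria for a hypothetical 72-year-old diagnosed in June 2003 with two years of prior Part A and Part B enrollment.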


Medicare claims were used to identify the timing of the first infused chemo-immunotherapy (C-I therapy) and radiation treatments provided to patients after diagnosis. International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) procedure codes [20], Healthcare Common Procedure Coding System (HCPCS) codes [21], and revenue codes were used to identify C-I and radiation therapy from both inpatient and outpatient claims. A list of codes is included in the Appendix.

The first C-I therapy regimen was reconstructed from 30 days of claims after the first C-I therapy infusion. Patients were classified as taking either CHOP or CVP (cyclophosphamide, vincristine, and prednisone) by assuming the use of prednisone when the other agents were present, because oral medications with no intravenous equivalent were not available in SEER-Medicare at the time our study was conducted. The use of rituximab with these was classified as R-CHOP or R-CVP. If other chemotherapy agents were used, or patients had claims indicative of chemotherapy infusions without the HCPCS codes (i.e., J-codes) to identify the specific chemotherapy agents, these were classified as "other" with or without rituximab.
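The regimen classification described above can be sketched as follows. This is a hedged illustration rather than the authors' algorithm: the function name, the agent strings, and the `unspecified_chemo_infusion` flag are hypothetical, and the sketch assumes that a regimen is labeled CHOP or CVP only when the infused chemotherapy agents match exactly (prednisone is assumed, as in the text, because oral drugs were not observable).

```python
from typing import Iterable

# Infused components only; prednisone is oral and therefore assumed.
CHOP_INFUSED = {"cyclophosphamide", "doxorubicin", "vincristine"}
CVP_INFUSED = {"cyclophosphamide", "vincristine"}


def classify_regimen(agents: Iterable[str],
                     unspecified_chemo_infusion: bool = False) -> str:
    """Classify the first regimen from agents seen in the 30-day window.

    `unspecified_chemo_infusion` marks claims that indicate a chemotherapy
    infusion without a J-code naming the agent (hypothetical flag).
    """
    seen = {a.lower() for a in agents}
    has_rituximab = "rituximab" in seen
    chemo = seen - {"rituximab"}

    if not chemo and not unspecified_chemo_infusion:
        return "rituximab alone" if has_rituximab else "no C-I therapy"

    if unspecified_chemo_infusion:
        base = "other"  # agent unknown, so no named regimen is assigned
    elif chemo == CHOP_INFUSED:
        base = "CHOP"
    elif chemo == CVP_INFUSED:
        base = "CVP"
    else:
        base = "other"

    if base == "other":
        return "other with rituximab" if has_rituximab else "other without rituximab"
    return "R-" + base if has_rituximab else base
```

Treating any unspecified infusion as "other" is one possible reading of the text; a different rule for mixed cases would change only the first branch of the regimen assignment.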

Mortality and Censoring

The date of death was assigned using the Medicare date, if available, even in cases where the SEER date was also available. The Medicare date was preferred because it is more current than the SEER date [22]. In cases where a date of death was recorded in SEER but missing in Medicare, the SEER date was used. The cause of death was classified as NHL or other/unknown cause based on ICD-9-CM codes. All other patients were assumed to be alive at the end of the analysis period (December 31, 2007), although they may have been censored earlier for other reasons, such as switching to HMO coverage.
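The death-date and censoring rules can be sketched as below. The function and parameter names are hypothetical, and the sketch assumes that a death occurring on or before the earliest censoring date counts as an observed event; the text does not state how ties were handled.

```python
from datetime import date
from typing import Optional, Tuple

END_OF_ANALYSIS = date(2007, 12, 31)


def assign_death_date(medicare_death: Optional[date],
                      seer_death: Optional[date]) -> Optional[date]:
    """Prefer the Medicare date of death; fall back to the SEER date."""
    return medicare_death if medicare_death is not None else seer_death


def follow_up_end(medicare_death: Optional[date],
                  seer_death: Optional[date],
                  hmo_enrollment: Optional[date] = None,
                  second_primary: Optional[date] = None) -> Tuple[date, bool]:
    """Return (end of follow-up, whether death was observed)."""
    death = assign_death_date(medicare_death, seer_death)
    # Earliest of the censoring events named in the text.
    censor = min(d for d in (hmo_enrollment, second_primary, END_OF_ANALYSIS)
                 if d is not None)
    if death is not None and death <= censor:
        return death, True
    return censor, False
```

A patient with no recorded death and no earlier censoring event is followed to December 31, 2007 and treated as alive.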

Patient Characteristics

Patients were described according to their demographic, clinical, and socioeconomic characteristics. Patient age at diagnosis was stratified into four groups: 66 - 69; 70 - 74; 75 - 79; and ≥ 80. Requiring eligible patients to have at least one year of Medicare enrollment prior to diagnosis ensured that the minimum age in the cohort was 66 years. Race/ethnicity was defined using the SEER re-coded race variable in combination with the Medicare race variable as follows: black, if both variables indicated black race; white, if both variables indicated white race, or if the Medicare variable indicated white race and the SEER variable indicated Hispanic ethnicity (since SEER uses Hispanic surname to assign Hispanic ethnicity); and other, which in SEER consists predominantly of American Indian/Native Alaskan, Native Hawaiian or Other Pacific Islander, and Asian [23].
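The age grouping and the combined race/ethnicity definition can be sketched as two small functions. The string values are hypothetical stand-ins for the SEER and Medicare codes, and assigning discordant combinations that the text does not mention to "other" is an assumption made only for this illustration.

```python
def age_group(age_at_diagnosis: int) -> str:
    """Assign one of the four age strata used in the text."""
    if age_at_diagnosis < 66:
        raise ValueError("cohort minimum age is 66 years")
    if age_at_diagnosis <= 69:
        return "66-69"
    if age_at_diagnosis <= 74:
        return "70-74"
    if age_at_diagnosis <= 79:
        return "75-79"
    return "80+"


def classify_race(seer_race: str, medicare_race: str) -> str:
    """Combine the SEER re-coded race and Medicare race variables.

    Input strings such as "black", "white", and "hispanic" are hypothetical.
    """
    seer = seer_race.lower()
    medicare = medicare_race.lower()
    if seer == "black" and medicare == "black":
        return "black"
    if medicare == "white" and seer in ("white", "hispanic"):
        return "white"
    # Assumption: every remaining combination is grouped as "other".
    return "other"
```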

Summary staging is the approach SEER uses to categorize how far a cancer has spread from its point of origin [24]. It uses all information available in the medical record, and is a combination of the most precise clinical and pathological documentation of the extent of disease. DLBCL is classified as Stage I-IV according to the number of lymph node regions, the location of those regions, involvement of the spleen, and involvement of extra-lymphatic organs/sites. With the exception of Stage IV disease, in which all patients have multifocal involvement, patients classified as Stage I-III may or may not have extra-nodal involvement. Consequently, patients also were classified according to whether their disease was confined to one or more lymph node regions (nodal), or involved the spleen or an extra-lymphatic organ or site (extra-nodal). Finally, they were classified according to the histologic subtype of DLBCL: centroblastic (ICD-O-3 code 9680); or immunoblastic (ICD-O-3 code 9684) [18, 19].

We used the Medicare inpatient (Part A), outpatient, and physician (Part B) records to calculate an NCI Comorbidity Index score for each patient [25]. This approach [26, 27] entails first removing claims that are considered to have unreliable diagnosis coding, such as those for testing procedures used to rule out conditions. Then, remaining diagnosis and procedure codes are used to identify the 15 non-cancer comorbidities in the Charlson Comorbidity Index (CCI) [28]. The algorithms used to identify these conditions reflect the Deyo [29] adaptation of the CCI, and include several procedure codes from the Romano [30] adaptation. A weight is assigned to each condition, and the weights are summed to obtain the index for each patient. Medicare inpatient and outpatient claims, excluding those likely to be made only for testing purposes, were used to identify the presence of anemia, neutropenia, thrombocytopenia, and other cardiovascular disease prior to diagnosis. These are not included in the NCI Comorbidity Index.
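As a compact restatement of the weighted sum described above, the index for patient $i$ can be written as follows. The symbols $w_k$ (weight for condition $k$) and $x_{ik}$ (indicator that patient $i$ has condition $k$) are notation introduced here for illustration; the weights themselves are those of the cited index and are not reproduced.

```latex
\[
  \text{NCI Comorbidity Index}_i \;=\; \sum_{k=1}^{15} w_k \, x_{ik},
  \qquad x_{ik} \in \{0, 1\}
\]
```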

Socioeconomic information at the patient level is not available through SEER-Medicare. Instead, the SEER-Medicare dataset contains information from the 2000 Census, reported at the tract level in which the patient lives, for the percent of the population living in poverty and the percent of those age 25 years or older with some college. We used these as indicators of the socioeconomic status of individual patients in the DLBCL cohort. SEER registry (consolidated into 13 regions, with California as a single region, and excluding Arizona Native Americans) and the assigned metropolitan statistical area as recoded by SEER (big metropolitan, metropolitan, urban, less urban, and rural) were used as geographic indicators.

Statistical Analysis

Cox proportional hazards regression was used to identify factors associated with treatment and survival. Both the time-to-treatment and survival analyses were stratified by Stage I-II versus III-IV at diagnosis. In addition to the time-to-treatment analyses, logistic regression was used to identify factors associated with receiving therapy within 180 days following diagnosis. Survival analysis was conducted with three different mortality endpoints: all-cause mortality; NHL mortality; and other/unknown cause mortality. The base case multivariate survival analyses were performed without treatment as a covariate, such that any racial disparities in access to treatment that also impacted survival were captured in the race covariates in the survival analyses. We then repeated all the multivariate survival analyses with treatment included as a time-dependent covariate. We reasoned that comparing the race covariates from models with versus without treatment included would provide insight into the impact racial disparities in access to treatment had on racial disparities in survival.
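For readers less familiar with these models, the comparison described above can be written out in generic form. The notation is introduced here for illustration and is not taken from the study: $x_{\text{black}}$ is the indicator for black race, $\mathbf{z}$ collects the remaining covariates, and $T(t)$ indicates whether treatment has begun by time $t$.

```latex
\begin{align*}
  \text{Base case:}\quad
    h(t \mid x_{\text{black}}, \mathbf{z})
      &= h_0(t)\,\exp\!\big(\beta_{\text{black}}\, x_{\text{black}}
         + \boldsymbol{\beta}^{\top}\mathbf{z}\big),
    & \mathrm{HR}_{\text{black}} &= e^{\beta_{\text{black}}} \\[4pt]
  \text{With treatment:}\quad
    h(t \mid x_{\text{black}}, \mathbf{z}, T)
      &= h_0(t)\,\exp\!\big(\beta^{*}_{\text{black}}\, x_{\text{black}}
         + \boldsymbol{\beta}^{*\top}\mathbf{z} + \gamma\, T(t)\big),
    & \mathrm{HR}^{*}_{\text{black}} &= e^{\beta^{*}_{\text{black}}} \\[4pt]
  \text{Logistic (180 days):}\quad
    \operatorname{logit}\Pr(\text{treated within 180 days})
      &= \alpha + \theta_{\text{black}}\, x_{\text{black}}
         + \boldsymbol{\theta}^{\top}\mathbf{z},
    & \mathrm{OR}_{\text{black}} &= e^{\theta_{\text{black}}}
\end{align*}
```

Under this notation, a value of $\beta^{*}_{\text{black}}$ closer to zero than $\beta_{\text{black}}$ would indicate that part of the association between race and mortality operates through treatment.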

Editorial Note

Notification of all NCI approvals for this study was obtained from IMS, Inc., via email correspondence, on July 29th, 2008. At the time this study was approved, NCI did not require Institutional Review Board approval prior to releasing SEER-Medicare data. However, since the SEER-Medicare data contain information about geographic location at the county level, as well as dates of receiving health care services, the SEER-Medicare data are considered by Health Insurance Portability and Accountability Act (HIPAA) requirements as a limited data set, which requires that all investigators sign a Data Use Agreement (DUA) prior to receiving the data. The DUA for this study was executed by Dr. Danese on June 7, 2008.


We identified 7,048 patients who met the study eligibility criteria (Table 1). The median age at diagnosis was 77 years, 54% were female, 88% were white, and 43% had Stage III or IV disease at diagnosis. Overall, 5,887 (84%) received C-I therapy or radiation treatment during the observation period: 5,555 (94%) received C-I therapy with (1,826: 31%) or without radiation (3,729: 63%); and the remainder (332: 6%) received radiation alone (not shown in Table). Among those who received C-I therapy (5,555), 46% (2,569) received chemotherapy alone, 45% (2,488) received rituximab plus chemotherapy, and the remainder (498: 9%) received rituximab alone. R-CHOP was the most common chemotherapy regimen (2,167: 39%).

Table 1 Patient Characteristics

As shown in Figure 1, the median time to first treatment following DLBCL diagnosis was 42 days, and 92% (5,398/5,887) of the patients receiving treatment began within 180 days following diagnosis. The unadjusted time to beginning treatment was 10 days longer for blacks versus whites. In multivariate analysis of time to treatment using Cox regression, the treatment rate was significantly lower among patients ≥ 80 years old, blacks versus whites, those living in a census tract with ≥ 12% poverty, and those with extra-nodal disease. The treatment rate was higher in those diagnosed with later-stage disease (Table 2). In stratified analyses, the treatment rate was lower in blacks than whites among Stage I-II and among Stage III-IV patients. Findings from the logistic regression analysis of treatment within the first 180 days following diagnosis were consistent with those from the Cox regression analyses (Table 2), except that the effect sizes were generally larger in the logistic regression analysis. For instance, in the logistic regression model, the OR for treatment among blacks versus whites was 0.63, compared to a hazard ratio (HR) of 0.77 in the Cox model that included all patients.

Figure 1 Time to Treatment, by Race

Table 2 Multivariate Analysis of Treatment

There were 4,188 (59% of the population) deaths during the observation period: 2,366 (56%) had NHL listed as the cause of death, and the remaining 1,822 (44%) had another cause listed (919: 50%) or the cause of death was not recorded (903: 50%). The median survival was two years, and 95% survived at least six weeks following diagnosis, which, as reported above, was also the median time to initial treatment. In multivariate survival analysis using Cox regression, which did not include treatment as a covariate, older age, male gender, black race, immunoblastic histology, advanced stage at diagnosis, higher NCI Comorbidity Index, anemia, cardiovascular disease, and living in a census area with ≥ 12% poverty all were associated with higher all-cause mortality (Table 3). Being diagnosed later in the observation period was associated with lower all-cause mortality. Black race was associated with higher mortality due to other/unknown causes but it was not associated with higher mortality due to NHL. In the NHL mortality model, the effect sizes were larger for histology, Stage, year of diagnosis, and anemia than in the other/unknown cause mortality model. In the other/unknown cause mortality model, effect sizes were larger for age, NCI Comorbidity Index, cardiovascular disease, and poverty.

Table 3 Multivariate Survival Analysis

In multivariate analyses stratified by Stage I-II and III-IV (Table 4), which did not include treatment as a covariate, black race was associated with significantly higher all-cause mortality among Stage III-IV patients, but not among Stage I-II patients. The effect sizes for age, gender, NCI Comorbidity Index, and cardiovascular disease were larger in the model of Stage I-II patients than the model of Stage III-IV patients, whereas the effect size for year of diagnosis was larger in the model of Stage III-IV patients.

Table 4 Multivariate Survival Analysis - All-Cause Mortality, by Stage

In multivariate survival analyses that included treatment (Yes/No) as a covariate, the associations between black race and all-cause mortality, and between black race and other/unknown cause mortality, remained statistically significant (HR = 1.19, P = 0.04; and HR = 1.27, P = 0.05, respectively). In the multivariate analyses stratified by Stage, black race remained significant for patients diagnosed with Stage III-IV disease (HR = 1.28, P = 0.03). Treatment was statistically significant in all five models.


We conducted a study using SEER-Medicare to identify factors associated with treatment and mortality in an elderly cohort of patients diagnosed with DLBCL. In particular, we sought to understand whether racial differences in treatment observed in an earlier study of SEER-Medicare data [14] were present in more recent SEER-Medicare data. Also, we reasoned that any observed differences in treatment might now translate into greater differences in survival following the introduction of rituximab, which, when added to CHOP, has been shown to improve overall survival in aggressive NHL [6-10]. Our findings show that blacks were less likely to receive treatment than whites. In multivariate analysis of time to initial treatment, the adjusted rate of treatment was 23% lower in blacks than whites. Furthermore, blacks were 37% less likely than whites to begin treatment within the first 180 days following diagnosis. The difference between the black race coefficients in these two models, together with the Kaplan-Meier curves illustrating time to first treatment, suggests that while there were persistent differences in treatment rates during the entire observation period, the differences were greatest within the first six months following diagnosis. This is of considerable concern in DLBCL since it is an aggressive type of NHL where frontline treatment with C-I therapy is now recommended in most instances [3].
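The percentages quoted above follow directly from the reported ratios (HR = 0.77 and OR = 0.63). A minimal sketch of the arithmetic, using a hypothetical helper name, is shown below; note that for an odds ratio the result is strictly a difference in odds, which approximates a difference in probability only loosely when the outcome is common.

```python
def percent_difference_from_ratio(ratio: float) -> float:
    """Convert a hazard or odds ratio into a percent difference from 1.0.

    Negative values mean a lower rate (or odds) than the reference group.
    """
    if ratio <= 0:
        raise ValueError("a hazard or odds ratio must be positive")
    return (ratio - 1.0) * 100.0


# Ratios reported in the text for blacks versus whites.
treatment_hr = round(percent_difference_from_ratio(0.77))  # Cox model, time to treatment
treatment_or = round(percent_difference_from_ratio(0.63))  # logistic model, 180 days
print(treatment_hr, treatment_or)
```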

We next examined factors associated with mortality. In the multivariate survival analysis, black race was associated with 24% higher all-cause mortality, adjusting for demographic and clinical covariables. When we divided the cause of death into that recorded as NHL versus other/unknown, we found that black race was associated with 35% higher mortality due to other/unknown causes. However, black race was not associated with statistically significantly higher NHL mortality. When we compared these two models, we found that several of the cancer covariables, e.g. Stage, histology, and anemia, had larger effects in the NHL mortality model than in the other/unknown cause mortality model. Also, the effect of year of diagnosis was greater in the NHL model than in the other/unknown cause mortality model, perhaps reflecting improvements in treatment of DLBCL over time. In contrast, several of the demographic and general comorbidity variables, e.g. gender, NCI Comorbidity Index, poverty, and cardiovascular disease, had larger effects in the other/unknown cause mortality model than the NHL model. When we stratified the analysis of all-cause mortality by disease Stage at diagnosis, we found that black race was associated with 35% higher mortality in Stage III-IV patients, but not with statistically significantly higher mortality in Stage I-II patients. In general, the cancer covariables had greater effects in the Stage III-IV model than in the Stage I-II model, and the opposite was true for the demographic and general comorbidity variables. It is interesting to note that when treatment was added as a covariate to the survival models, the results pertaining to black race were consistent with those in the base case analyses, except that the coefficients were smaller in the models that included treatment. This suggests that poorer access to treatment partially, but not fully, explains the disparities in survival we observed.

While our findings suggest that racial differences in all-cause mortality are due primarily to causes other than NHL, it is important to interpret the findings in the context of several limitations of our study. First, in SEER, the cause of death is obtained from state death certificates, and the underlying cause as coded by state health departments is accepted [22]. There is no cause of death listed on the Medicare side of SEER-Medicare, and it is therefore acknowledged that cause of death is inherently less reliable than other SEER variables [22]. In our study, 50% of patients who died from other/unknown causes had no cause of death assigned. It is possible that some of these patients died of causes related to NHL. This might have artificially inflated the actual association between black race and other/unknown cause mortality. We are not aware of any studies that validate the cause of death as recorded in SEER for patients with NHL. However, a recent study showed that cause of death coding for colon cancer in SEER had an estimated validity of 94.6% [31]. Of note, the estimated validity was lower for blacks (84.4%) compared to whites (95.4%), suggesting that any misclassification in our study could have impacted black patients disproportionately. If fewer blacks than whites were correctly assigned NHL as the cause of death, this could have artificially deflated the actual association between black race and NHL mortality in our analysis.

Second, observed disparities in cancer outcomes among racial and ethnic minorities can reflect obstacles to receiving health care services, including prevention, early detection, and high quality treatment [1]. Although one overriding factor is poverty [1], disparities have been shown to exist even when access-related factors such as insurance status and the ability to pay for care are the same across racial and ethnic groups [11]. Our study was conducted in a cohort of SEER-Medicare patients who had Medicare Part A and Part B insurance for at least one year before DLBCL diagnosis and throughout the observation period following diagnosis. Furthermore, we included measures of income and education in our multivariate treatment and survival models. Patient selection and the inclusion of income and education variables should have reduced the impact of insurance status and ability to pay as access-related factors in our analyses. However, we were unable to account for potential differences in Part B and Part D supplemental insurance (to cover Part B copayments) in our analyses. Also, we were unable to account for disparities in health insurance prior to Medicare eligibility, which could have affected access to health services, comorbidity, and severity of DLBCL at diagnosis in ways we could not adjust for in our analyses. Moreover, since SEER-Medicare does not report income and education levels for individual patients, we relied on census tract-level data.

Third, our study was based on the same data source (SEER-Medicare) used by Wang and colleagues in their earlier study [14], and our approaches were similar. Although both studies showed racial disparities in treatment, only ours also showed a difference in mortality. However, since we did not have access to SEER-Medicare data for the earlier time period in which Wang and colleagues conducted their study, we cannot conclude that the racial disparities in mortality we observed indicate a fundamental change from the earlier period. Rather, they could reflect differences in the patient population or analytic approach between our study and that of Wang and colleagues [14].

Finally, our study included a cohort of very elderly patients, with a median age of 77 years. As shown, mortality was high in this population, with only 50% surviving beyond two years. As a result, our findings may not be applicable to younger patients who have a lower risk of mortality overall. Certainly, studies of current treatment modalities for DLBCL would suggest survival is considerably better in younger patients [6].


Our findings show that the treatment rate was lower and the mortality rate was higher in black compared to white patients diagnosed with DLBCL. It is likely that the observed differences in mortality between blacks and whites are due to a number of factors, including differences in cancer and non-cancer related morbidity, as well as differences in treatment.


Table 5 Codes for Identifying Chemo-Immunotherapy and Radiation from Medicare Inpatient and Outpatient Claims