Estimating the prevalence of hematological malignancies and precursor conditions using data from Haematological Malignancy Research Network (HMRN)

Well-established cancer registries that routinely link to death registrations can estimate prevalence directly by counting patients alive at a particular point in time (observed prevalence). Such direct methods can only provide prevalence for the years over which the registry has been operational. Time-defined estimates, including 5- and 10-year prevalence, may however underestimate the total cancer burden, and compared with other cancers, there is a lack of accurate information on the total prevalence of hematological malignancy subtypes. Accordingly, we aimed to estimate prevalence (observed and total prevalence) of hematological malignancies and precursor conditions by clinically meaningful subtypes using data from the UK’s specialist population-based register, the Haematological Malignancy Research Network (www.hmrn.org). Observed and total prevalences were estimated from 15,810 new diagnoses of hematological malignancies made from 2004 to 2011 and followed up to 31 August 2011 (the index date). Observed prevalence was calculated by the counting method, and a method based on modelling incidence and survival was used to estimate total prevalence. Estimates were made according to current disease classification for the HMRN region and for the UK. The overall observed and total prevalence rates were 281.9 and 548.8 per 100,000, respectively; the numbers of observed and total prevalent cases in the UK were estimated as 165,841 and 327,818, respectively. As expected, variation existed by disease subtype, reflecting the heterogeneity in underlying disease incidence, survival and age distribution of hematological cancers. This study demonstrates the importance of estimating ‘total’ prevalence rather than ‘observed’ prevalence by current disease classification (ICD-O-3), particularly for subtypes that have a more indolent nature and for those that are curable.
Importantly, these analyses demonstrate that relying on observed prevalence alone would result in a significant underestimation of the relative burden of some subtypes. While many of these cases may be considered cured and no longer being actively treated, people in this survivorship phase may have long-term medical needs and accordingly, it is important to provide accurate counts to allow for healthcare planning.


Introduction
Cancer prevalence may be defined as the proportion of people in a population who have ever received a cancer diagnosis and who are alive on a specified date (the index date). Cancer prevalence, which is generally estimated using data from cancer registries [1], provides information on the healthcare needs of cancer patients who are on long-term medication and/or who are being monitored at regular intervals. In addition, for cancers that can be cured, prevalence is used to estimate the size of the survivor population.
Well-established cancer registries that routinely link to death registrations can estimate prevalence directly by counting patients alive at a particular point in time. Such direct methods can, however, only provide prevalence for the years over which the registry has been operational; the term observed prevalence is often used to describe estimates derived from registries that have been established for shorter periods. In such cases, it is common practice to quote the length of time over which the registry has been functioning alongside the observed prevalence estimate. Likewise, 5- and 10-year prevalence estimates are commonly used to assess cancer burden: the former counts patients diagnosed and ascertained in the previous 5 years and the latter those diagnosed in the previous 10 years. Time-defined prevalence estimates may, however, underestimate the total cancer burden, and in order to provide better guidance for healthcare planning, a variety of methods have been developed to estimate total prevalence (the proportion of the population alive on the index date who have ever received a diagnosis of the cancer). Total prevalence is usually estimated using models that incorporate incidence and survival [2][3][4][5][6][7].
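The time-defined counts described above can be illustrated with a short Python sketch. It is only an illustration of the counting idea: the patient records, the function names and the tuple layout (date of diagnosis, date of death or None) are hypothetical and are not taken from the HMRN data.

```python
from datetime import date


def years_before(d, years):
    """Return the same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return d.replace(year=d.year - years)
    except ValueError:
        return d.replace(year=d.year - years, day=28)


def limited_duration_prevalence(patients, index_date, years):
    """Count patients diagnosed in the `years` before index_date and alive on it."""
    window_start = years_before(index_date, years)
    count = 0
    for diagnosed, died in patients:
        in_window = window_start < diagnosed <= index_date
        alive_on_index = died is None or died > index_date
        if in_window and alive_on_index:
            count += 1
    return count


# Hypothetical records: (date of diagnosis, date of death or None)
INDEX_DATE = date(2011, 8, 31)
PATIENTS = [
    (date(2003, 5, 1), None),                 # alive, diagnosed 8 years before
    (date(2008, 1, 15), None),                # alive, diagnosed 3 years before
    (date(2009, 3, 2), date(2010, 6, 30)),    # died before the index date
    (date(2010, 11, 20), date(2012, 1, 5)),   # died after the index date
    (date(1995, 7, 7), None),                 # alive, diagnosed 16 years before
]

five_year = limited_duration_prevalence(PATIENTS, INDEX_DATE, 5)
ten_year = limited_duration_prevalence(PATIENTS, INDEX_DATE, 10)
twenty_year = limited_duration_prevalence(PATIENTS, INDEX_DATE, 20)
```

In this toy example the count grows as the window lengthens, which is why a registry with a short history can only report a lower bound on total prevalence.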
Compared with other cancers, there is a lack of accurate information on the total prevalence of clinically meaningful hematological malignancy subtypes. This is partly because these complex cancers are diagnosed using a combination of histology, cytology, immunophenotype, cytogenetics, imaging and clinical data, and this range and depth of data are difficult for cancer registries to access systematically [8,9]. Hence, the broad ICD-10 classification (leukemia, Hodgkin lymphoma, non-Hodgkin lymphoma and myeloma) has continued to be applied by many national registries [10], including the UK's National Cancer Intelligence Network, the USA's Surveillance, Epidemiology and End Results Program and the WHO's International Agency for Research on Cancer [11][12][13][14].
Accordingly, the present study aims to estimate prevalence (observed prevalence and total prevalence) of hematological malignancies for clinically meaningful subtypes using data from the UK's specialist population-based register, the Haematological Malignancy Research Network (HMRN).

Calculating prevalence
The index date for prevalence calculations was taken to be 31 August 2011. Observed prevalence was calculated directly by counting the number of patients newly diagnosed with a hematological cancer from 1 September 2004 to 31 August 2011 who were alive on the index date. Total prevalence for each subtype was derived by applying an estimated correction factor, the completeness index (R), defined as the proportion of the total prevalence represented by the observed prevalence, as described by Capocaccia and Angelis [17]. The calculation of the completeness index is based on incidence and survival models [17]. To accommodate the characteristics of hematological malignancies, a regression spline was used to estimate incidence for single years of age. This nonparametric method produces a smooth curve that is not sensitive to the assumptions required by a parametric incidence function. For survival models, the Weibull function has previously been successfully applied to prevalence estimates [17][18][19]. Based on the Weibull function, the influence of age at diagnosis was described using a spline and modelled as an exponential factor in the survival function.
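The role of the completeness index can be sketched in Python as follows. This is a deliberately simplified illustration, not the Stata/R implementation used in the study: the incidence function, the Weibull parameters and all function names are hypothetical, and the sketch ignores competing mortality and the age effect on survival.

```python
import math


def weibull_survival(t, scale, shape):
    """Weibull survival function S(t) = exp(-(t / scale) ** shape), t >= 0."""
    return math.exp(-((t / scale) ** shape))


def completeness_index(age, registry_years, incidence, survival):
    """Share of prevalent cases at `age` diagnosed within the last `registry_years`."""

    def expected_cases(first_age):
        # Sum over single ages at diagnosis: incidence at that age times the
        # probability of surviving from diagnosis until the index age.
        return sum(
            incidence(a) * survival(age - a) for a in range(first_age, age + 1)
        )

    total = expected_cases(0)
    observed = expected_cases(max(0, age - registry_years))
    return observed / total


def total_from_observed(observed_prevalence, r):
    """Correct an observed prevalence by the completeness index R."""
    return observed_prevalence / r


# Hypothetical model inputs, chosen only to make the sketch runnable
def demo_incidence(a):
    return 1e-5 * (1 + a / 10)  # incidence rising with age


def demo_survival(t):
    return weibull_survival(t, scale=20.0, shape=0.8)


r_7_years = completeness_index(70, 7, demo_incidence, demo_survival)
r_20_years = completeness_index(70, 20, demo_incidence, demo_survival)
r_lifetime = completeness_index(70, 70, demo_incidence, demo_survival)
```

With these inputs R rises towards 1 as the registry history lengthens, and dividing an observed prevalence by R scales it up to the corresponding total prevalence.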
As the age and sex structure in the HMRN region mirrors that of the UK as a whole (Fig. 1), national prevalence was estimated by applying the HMRN rates to the UK population for both genders. All calculations were conducted using Stata 11.0 and R 3.0.1 software.
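Applying regional rates to a national population amounts to multiplying each rate by the matching population count, as in the Python sketch below. The rates and population sizes are invented placeholders, not HMRN or UK figures, and the function name is hypothetical.

```python
def expected_cases(rates_per_100k, population):
    """Multiply each stratum's rate (per 100,000) by that stratum's population."""
    return {
        stratum: rates_per_100k[stratum] * population[stratum] / 100_000
        for stratum in rates_per_100k
    }


# Placeholder inputs for illustration only (not HMRN or UK figures)
DEMO_RATES = {"male": 300.0, "female": 260.0}                 # per 100,000
DEMO_POPULATION = {"male": 31_000_000, "female": 32_000_000}  # persons

cases_by_sex = expected_cases(DEMO_RATES, DEMO_POPULATION)
cases_total = sum(cases_by_sex.values())
```

The same function works unchanged if the dictionary keys are finer strata, for example (sex, age band) pairs.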

Results
There were 15,810 diagnoses of hematological malignancies from 2004 to 2011, of which 8,799 were among males (55.7 %) and 7,011 were among females (44.3 %). The crude incidence rate for all hematological cancers combined was 63.2 per 100,000 per year and, as expected, incidence and survival varied by subtype. This variation is summarized in Table 1, where diagnoses are grouped according to their incidence magnitude (<2, 2-5, >5 per 100,000) and overall survival (<30, 30-70, >70 %). In addition to incidence and survival, age at diagnosis plays an important role in prevalence, and as with incidence and survival, hematological malignancies exhibit much greater variation than most other cancers. Indeed, with different subtypes dominating at different ages, hematological malignancy can be diagnosed at any age, with the median age at diagnosis ranging from 15.3 years for acute lymphoblastic leukemia to 77.3 years for chronic myelomonocytic leukemia. Most subtypes had an older median age at diagnosis (70.6 years for all hematological malignancies combined) (Fig. 2).
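The figures in this paragraph can be cross-checked with a few lines of Python. The counts and the crude rate are those quoted above; the 7-year ascertainment period is taken from the diagnosis window (1 September 2004 to 31 August 2011), and the implied population is only a back-calculation made here, not a figure reported in the text.

```python
# Figures quoted in the Results text
TOTAL_DIAGNOSES = 15_810
MALE_DIAGNOSES = 8_799
FEMALE_DIAGNOSES = 7_011
CRUDE_RATE_PER_100K = 63.2    # per 100,000 per year
YEARS_OF_ASCERTAINMENT = 7    # assumption: 1 September 2004 to 31 August 2011


def percentage(part, whole):
    """Percentage of `whole` made up by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


def implied_population(cases, years, rate_per_100k):
    """Back-calculate the average population behind a crude annual rate."""
    return cases / years / rate_per_100k * 100_000


male_pct = percentage(MALE_DIAGNOSES, TOTAL_DIAGNOSES)
female_pct = percentage(FEMALE_DIAGNOSES, TOTAL_DIAGNOSES)
population = implied_population(
    TOTAL_DIAGNOSES, YEARS_OF_ASCERTAINMENT, CRUDE_RATE_PER_100K
)
```

Under these assumptions the sex split reproduces the quoted percentages, and the crude rate corresponds to a catchment population of roughly 3.6 million.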
Observed and total prevalence estimates (per 100,000) together with completeness indices (R) are presented in Table 2. The overall observed prevalence rate was 281.9 per 100,000 and the overall total prevalence rate was 548.8 per 100,000. [...] Hodgkin lymphoma tend to be cured, resulting in a larger number of prevalent cases after middle age, while follicular lymphoma is rarely diagnosed before the age of 40 years. Observed and total prevalence estimates for the UK as a whole are presented in Figs. 3 (males) and 4 (females): observed prevalence (blue bars) and total prevalence (blue + red bars). Hematological malignancy subtypes are ranked in order of descending total prevalence. In total, the observed prevalence was estimated to be 165,841 cases and total prevalence 327,818 cases. Table 3 lists the UK observed and total prevalence estimates of the top five most prevalent hematological malignancies for males and females separately. Clearly, relying on observed prevalence alone would have resulted in a significant underestimation of the relative burden of some diseases. Observed prevalence, for example, ranks Hodgkin lymphoma as 6th for men and 8th for women, whereas total prevalence places it second for both genders. In other words, compared with observed prevalence, the relative contribution of Hodgkin lymphoma increases when longer prevalence periods are considered.

Discussion
This study is the first to estimate observed and total prevalence for hematological malignancies using up-to-date clinically meaningful disease classifications. The results suggest that at any one time, around 19,700 people in the study region are likely to be living with a prior diagnosis of a hematological malignancy or a recognized precursor condition (monoclonal gammopathy of uncertain significance or monoclonal B cell lymphocytosis); this equates to around 327,800 people in the UK. After calculating total prevalence, the most prevalent malignancy was chronic lymphocytic leukemia in men and Hodgkin lymphoma in women.
Established in 2004, the HMRN's population-based patient cohort provided an estimate of hematological malignancy prevalence that accounted for about half of the total (completeness index of 0.51). Consistent with expectations, the differences between total prevalence and observed prevalence estimates were typically seen in less fatal cancers that are commonly diagnosed at a younger age. For example, Hodgkin lymphoma generally has good survival, and total prevalence estimates exceed those of observed prevalence, while the difference between observed prevalence and total prevalence is slight for mantle cell lymphoma, which has generally poor survival. Again, as expected, large differences between observed and total prevalence were also seen for precursor conditions. Information on 3-, 5- and 10-year prevalence is available on the study's website (www.hmrn.org/statistics/prevalence) and has been published for the lymphomas and myeloid malignancies [20,21].
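The overall completeness index of 0.51 quoted above can be reproduced from the figures reported in the abstract and Results, as the ratio of observed to total prevalence. The short Python check below uses only those published numbers; the variable names are illustrative.

```python
# Overall figures reported in the abstract and Results
OBSERVED_RATE = 281.9      # per 100,000
TOTAL_RATE = 548.8         # per 100,000
OBSERVED_UK_CASES = 165_841
TOTAL_UK_CASES = 327_818

# Completeness index R = observed prevalence / total prevalence
r_from_rates = OBSERVED_RATE / TOTAL_RATE
r_from_uk_cases = OBSERVED_UK_CASES / TOTAL_UK_CASES

# Share of the total prevalence not captured by the 7-year registry window
missed_share = 1 - r_from_rates
```

Both ratios round to 0.51, so roughly half of the people estimated to be living with a prior diagnosis were diagnosed before the registry began.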
Not only is the HMRN region similar to the UK as a whole in terms of its age and sex distribution, but it is also broadly similar by urban/rural and deprivation status [16]; accordingly, rates were not standardized by age and sex. Likewise, according to the 2011 census [22], the proportion of HMRN's population classified as white was the same as in the UK as a whole (87 %). However, some ethnic groups are underrepresented in the region, primarily the black ethnic group. For some hematological malignancies, such as myeloma, incidence and survival have been shown to vary by ethnicity, with higher rates of both in black ethnic groups [14,23,24]. Accordingly, HMRN rates may underestimate myeloma prevalence in areas of the country with a higher proportion of black people [25]. This study assumes that the survival rate was constant over time; however, for some subtypes, there have been dramatic changes in outcomes due to the introduction of new treatments. For example, the introduction of tyrosine kinase inhibitors has transformed chronic myelogenous leukemia (CML) from a fatal disease in nontransplanted patients to one where patients can now achieve a near-normal life span [26]. While CML is a rare disease (1 per 100,000), the use of current survival rates may lead to an overestimate of prevalence; accordingly, methods to estimate total prevalence need to be adapted to account for changes in outcome due to the introduction of novel therapies.
The major aim of this study was to estimate the prevalence of hematological malignancies and precursor conditions for clinically relevant diagnostic groups and explore the impact of calculating observed and total prevalence by current disease classification. For some subtypes, calculating observed prevalence would lead to an underestimation of the prevalent population, as cases diagnosed prior to the establishment of HMRN will not be captured. While many of these cases may be considered cured and no longer being actively treated, people in this survivorship phase may have long-term medical needs, and accordingly, it is important to provide accurate counts to allow for healthcare planning.

Conflict of interest No conflicts of interest declared.
Ethics statement Haematological Malignancy Research Network has ethics approval from Leeds West Research Ethics Committee, R&D approval from each NHS Trust in the study area and exemption from Section 251 of the Health and Social Care Act.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.