Introduction

Breast cancer is the most common form of cancer affecting women. One in eight to one in 12 women will develop the disease in their lifetime in the developed world. Every year over 44,000 women develop the disease in the UK (population 61 million) and more than 12,500 die from it [1].

While the widely quoted general population risk of breast cancer (one in eight to one in 12) is a lifetime risk, the risk in any given decade is never greater than one in 30. Furthermore, the proportion of all female deaths due to breast cancer per decade is never greater than 20%. The proportion is greatest in middle age, from 35 to 55 years, with cardiovascular deaths exceeding breast cancer deaths at all older ages and lung cancer causing more cancer deaths in women in the age group 60–85 years [2]. These comparisons underline the need for risk models for breast cancer and also the need to put these risks in the perspective of other diseases.

The presence of a significant family history is a highly important risk factor for the development of breast cancer. Even at extremes of age, the presence of a BRCA1 mutation will confer much greater risk than population risks. For instance, a 25-year-old woman who carries a mutation in BRCA1 has a greater risk within the next decade than a woman aged 70 years from the general population. About 4–5% of breast cancer is thought to be due to inheritance of a dominant cancer-predisposing gene [3, 4]. While hereditary factors are virtually certain to play a part in a high proportion of the remainder, these are harder to evaluate at present, although genome-wide association studies are likely to unravel these in the next 10 years [5]. Except in very rare cases such as Cowden's disease [6], there are no phenotypic clues that help to identify those who carry pathogenic mutations. Evaluation of the family history therefore remains necessary to assess the likelihood that a predisposing gene is present within a family. Inheritance of a germline mutation or deletion in a predisposing gene results in early-onset, and frequently bilateral, breast cancer. Certain mutations also confer an increased susceptibility to other malignancies, such as cancers of the ovary, and sarcomas [7–9]. Multiple primary cancers in one individual or related early-onset cancers in a pedigree are highly suggestive of a predisposing gene. Indeed, we have recently shown that at least 20% of breast cancers in patients aged 30 years and younger are due to mutations in the known high-risk genes BRCA1, BRCA2 and TP53 [10].

Although deaths from breast cancer have been decreasing in many western countries, the incidence of the disease is continuing to increase. In particular, breast cancer rates are rising rapidly in countries with an historically low incidence, making it presently the world's most prevalent cancer [2]. The increase in incidence is almost certainly related to changes in dietary and reproductive patterns associated with western lifestyles. These are not just a reflection of an ageing population obtaining extra surveillance, as age-specific risks in eastern countries adopting more western lifestyles are increasing [2]. Indeed, there is evidence from genetic studies in the United States, Iceland and the United Kingdom of a threefold increase in incidence not only in the general population, but also in those at the highest level of risk with BRCA1 and BRCA2 mutations in the past 80 years [11–13]. There is a need not only to predict which women will develop the disease, but also to apply drug and lifestyle measures in order to prevent the disease.

Types of risk assessment

There are two main types of risk assessment:

  • The chances of developing breast cancer over a given timespan, including the lifetime.

  • The chances of there being a mutation in a known high-risk gene such as BRCA1 or BRCA2.

While some risk-assessment models are aimed primarily at solving one of the questions, many also have an output for the other. For instance, the BRCAPRO model is primarily aimed at assessing the mutation probability but can have an output to assess breast cancer risk over time. The Cuzick–Tyrer model was developed to assess breast cancer risk over time but does have a readout for BRCA1/2 probability for the individual. To assess breast cancer risks over time as accurately as possible, all known risk factors for breast cancer need to be assessed.

Risk factors

Family history of breast cancer in relatives

  • Age at onset of breast cancer.

  • Bilateral disease.

  • Degree of relationship (first or greater).

  • Multiple cases in the family (particularly on one side).

  • Other related early-onset tumours (for example, ovary, sarcoma).

  • Number of unaffected individuals (large families with many unaffected relatives will be less likely to harbour a high-risk gene mutation).

Hormonal and reproductive risk factors

Hormonal and reproductive factors have long been recognised to be important in the development of breast cancer. Prolonged exposure to endogenous oestrogens is an adverse risk factor for breast cancer [14–22]. Early menarche and late menopause increase breast cancer risk as they prolong exposure to oestrogen and progesterone.

Long-term combined hormone replacement therapy (> 5 years) after the menopause is associated with a significant increase in risk. However, shorter treatments may still be associated with risk to those with a family history [15]. In a large meta-analysis the risk appeared to increase cumulatively by 1–2% per year, but to disappear within 5 years of cessation [16]. The risk from oestrogen-only hormone replacement therapy appears much less and may be risk neutral [17–20]. Another meta-analysis also suggested there may be a 24% increase in the risk of breast cancer both during current use of the combined oral contraceptive and 10 years post use [14].

The age at first pregnancy influences the relative risk of breast cancer as pregnancy transforms breast parenchymal cells into a more stable state, potentially resulting in less proliferation in the second half of the menstrual cycle. As a result, early first pregnancy offers some protection, while women having their first child over the age of 30 have double the risk of women delivering their first child under the age of 20 years.

Hormonal factors may indeed have different effects on different genetic backgrounds. It has been suggested that in BRCA2 mutation carriers, for example, an early pregnancy does not confer protection against breast cancer [21]. Most studies are now, however, showing that risk factors in the general population have a similar effect on those women with BRCA1/2 mutations [22, 23].

Other risk factors

A number of other risk factors for breast cancer are being further validated. Obesity, diet and exercise are probably interlinked [24, 25]. Mammographic density is perhaps the single largest risk factor that is assessable but may have a substantial heritable component [26]. Other risk factors such as alcohol intake have a fairly small effect, and protective factors such as breast-feeding are also of small effect unless a number of years of total feeding have taken place. None of these factors are currently incorporated into available risk assessment models.

Risk factors included

Current risk-prediction models are based on combinations of risk factors and have good overall predictive power, but are still weak at predicting which particular women will develop the disease. New risk-prediction methods are likely to result from examination of a range of high-risk genes as well as single nucleotide polymorphisms in several genes associated with lower risks [5]. These methods would be married in a prediction programme with other known risk factors to provide a far more accurate individual prediction.

At present, many of the known nonfamily-history risk factors are not included in risk models (Table 1). In particular, perhaps the greatest factor apart from age – mammographic density [26] – is not yet included. Further studies are in progress to determine whether inclusion of additional factors into existing models, such as mammographic density, weight gain [25] and serum steroid hormone measurements [26], will improve prediction. These are not straightforward additions as there may be significant interactions between risk factors and, although breast density is an independent risk factor for BRCA1 and BRCA2 cancer risk [25], the density itself may be heritable and may not increase risk in a similar way in the context of family history of breast cancer alone.

Table 1 Known risk factors and their incorporation into existing risk models

Breast cancer risk over time

Manual risk estimation

One of the best ways to assess risk is to consider the strongest risk factor, which in many assessment clinics is family history. If first-line risk can be assessed on this basis, then adjustments can be made for other factors [28, 29].

Manual breast cancer risk assessment is largely based on the published Claus risk tables [30] and, in clearcut BRCA1/2 families, on penetrance data for breast cancer [31]. At the very least, a good manual assessment will alert the assessor to any spurious readout from a computer model.

Risk estimation models

Until recently the two most frequently used models were the Gail model and the Claus model.

Gail model

Gail and colleagues [32, 33] described a risk-assessment model that focuses primarily on nongenetic risk factors, with limited information on family history. The model is an interactive tool designed by scientists at the National Cancer Institute and at the National Surgical Adjuvant Breast and Bowel Project to estimate a woman's risk of developing invasive breast cancer. The risk factors used were age at menarche, age at first live birth, number of previous breast biopsies, and number of first-degree relatives with breast cancer. A model of relative risks for various combinations of these factors was developed from case–control data from the Breast Cancer Detection Demonstration Project.

Individualised breast cancer probabilities are generated from information on relative risks and the baseline hazard rate. These calculations take competing risks and the interval of risk into account. The data depend on having periodic breast examinations. The Gail model was originally designed to determine eligibility for the Breast Cancer Prevention Trial, and has since been modified (in part to adjust for race) and made available on the National Cancer Institute website [34]. The model has been validated in a number of settings and probably works best in general assessment clinics, where family history is not the main reason for referral [27, 33, 35].
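The way relative risks, a baseline hazard and competing risks combine into an absolute risk over an interval can be sketched as follows. This is a minimal illustration of the general calculation, not the published Gail implementation: the function name, the piecewise-constant yearly hazards and all numerical inputs are assumptions chosen for the example.

```python
import math

def absolute_risk(baseline_hazards, competing_hazards, relative_risk,
                  interval_years=1.0):
    """Probability of developing breast cancer across consecutive age intervals.

    baseline_hazards:  breast cancer hazard per year in each interval for a
                       woman with relative risk 1 (hypothetical inputs)
    competing_hazards: hazard per year of death from other causes
    relative_risk:     combined relative risk for the individual
    """
    surviving = 1.0  # probability of being alive and cancer-free so far
    risk = 0.0
    for h_bc, h_other in zip(baseline_hazards, competing_hazards):
        h_individual = h_bc * relative_risk
        total = h_individual + h_other
        # Probability that either event occurs during this interval
        p_any_event = 1.0 - math.exp(-total * interval_years)
        if total > 0:
            # Share of those events that are breast cancer
            risk += surviving * (h_individual / total) * p_any_event
        surviving *= 1.0 - p_any_event
    return risk

# Hypothetical 10-year projection: yearly baseline hazard 0.002,
# competing mortality hazard 0.005, combined relative risk 2.0
ten_year_risk = absolute_risk([0.002] * 10, [0.005] * 10, 2.0)
```

Setting the competing hazards to zero recovers the simpler expression 1 − exp(−hazard × relative risk × years), which shows how allowing for competing risks lowers the projected probability.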

The major limitations of the Gail model are that it includes only first-degree relatives, which results in underestimating risk in the 50% of families with cancer in the paternal lineage, and that it takes no account of the age of onset of breast cancer. As such it performed less well in our own validation set from a family history clinic (Table 1), substantially underestimating risk overall and in most subgroups assessed [27].

Claus model

Claus and colleagues developed a risk model for familial risk of breast cancer in a large population-based, case–control study conducted by the Centers for Disease Control [3]. The data were based on 4,730 histologically confirmed breast cancer cases aged 20–54 years and on 4,688 controls who were frequency matched to cases on the basis of both geographic region and 5-year categories of age. Family histories were obtained through interviews with the cases and controls, regarding breast cancer in mothers and sisters.

The authors' segregation analysis provided evidence for the existence of a single rare autosomal dominant allele carried by one in 300 people leading to increased susceptibility to breast cancer. The effect of genotype on the risk of breast cancer was shown to be a function of a woman's age. Carriers of the risk allele were at greater risk at all ages, although the ratio of age-specific risks was greatest at young ages and declined steadily thereafter. The proportion of cases predicted to carry the allele was highest (36%) among cases aged 20–29 years. This proportion gradually decreased to 1% among cases aged 80 years or older. The cumulative lifetime risk of breast cancer for women who carried the susceptibility allele was predicted to be high, approximately 92%, while the cumulative lifetime risk for noncarriers was estimated to be 10% [3].
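As a rough check on how these figures fit together, the carrier frequency and the two lifetime risks quoted above can be combined into an implied population lifetime risk and an implied share of all cases arising in carriers. The sketch below is simple arithmetic on the quoted numbers; the function names are illustrative, and the calculation ignores the age dependence that the segregation analysis itself modelled.

```python
CARRIER_FREQUENCY = 1 / 300      # carriers of the susceptibility allele (as quoted)
LIFETIME_RISK_CARRIER = 0.92     # cumulative lifetime risk in carriers
LIFETIME_RISK_NONCARRIER = 0.10  # cumulative lifetime risk in noncarriers

def population_lifetime_risk(freq, risk_carrier, risk_noncarrier):
    """Average lifetime risk across carriers and noncarriers."""
    return freq * risk_carrier + (1 - freq) * risk_noncarrier

def carrier_share_of_cases(freq, risk_carrier, risk_noncarrier):
    """Fraction of all lifetime cases that occur in carriers."""
    overall = population_lifetime_risk(freq, risk_carrier, risk_noncarrier)
    return freq * risk_carrier / overall

overall_risk = population_lifetime_risk(
    CARRIER_FREQUENCY, LIFETIME_RISK_CARRIER, LIFETIME_RISK_NONCARRIER)
share = carrier_share_of_cases(
    CARRIER_FREQUENCY, LIFETIME_RISK_CARRIER, LIFETIME_RISK_NONCARRIER)
# overall_risk is a little above 10%; share is roughly 3% of all cases
```

Because this averages over the whole lifetime, it cannot reproduce the age-specific proportions reported above, which range from 36% in the youngest cases to 1% in the oldest.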

Three years after publication of the model, lifetime risk tables for most combinations of affected first-degree and second-degree relatives were published [30]. Although these do not give figures for some combinations of relatives (for example, mother and maternal grandmother), an estimation of this risk can be garnered using the mother–maternal aunt combination. An expansion of the original Claus model estimates breast cancer risk in women with a family history of ovarian cancer [36]. The major drawback of the Claus model is that it does not include any of the nonhereditary risk factors.

Concordance of the Gail and Claus models has been shown to be relatively poor, with the greatest discrepancies seen with nulliparity, multiple benign breast biopsies, and a strong paternal or first-degree family history [37, 38]. Indeed, a particular problem with the use of the Claus model is the discrepancy in results obtained when using the published tables [30], compared with computerised versions of the model [27–29, 39]. While the tables make no adjustments for unaffected relatives, the computerised version is able to reduce the likelihood of the 'dominant gene' with an increasing number of unaffected women. The tables give consistently higher risk figures than the computer model, however, suggesting that either a population risk element is not added back into the calculation or that the adjustment for unaffected relatives is made from the original averaged figure rather than from assuming that each family will have already had an 'average' number of unaffected relatives. The latter appears to be the probable explanation, as inputting families with zero unaffected female relatives gives risk figures close to the Claus table figure.

Another potential drawback of the Claus tables is that they reflect risks for women in the 1980s in the USA. These are lower than the current incidence in both North America and most of Europe. As such, an upward adjustment of 3–4% for lifetime risk is necessary for lifetime risks below 20%. Our own validation of the Claus computer model showed that it substantially underestimated risks in the family history clinic. Manual use of the Claus tables, however, provided accurate risk estimation (Table 1) [27]. A modified version of the Claus model has now been validated as the 'Claus extended' model, by adding risk for bilateral disease, ovarian cancer and three or more affected relatives [40].

BRCAPRO model

Parmigiani and colleagues [41] developed a Bayesian model that incorporated published BRCA1 and BRCA2 mutation frequencies, cancer penetrance in mutation carriers, cancer status (affected, unaffected, or unknown), and age of the consultee's first-degree and second-degree relatives. An advantage of this model is that it includes information on both affected and unaffected relatives. In addition, it provides estimates for the likelihood of finding either a BRCA1 mutation or a BRCA2 mutation in a family. An output that calculates breast cancer risk using the likelihood of BRCA1/2 can be utilised [27]. None of the nonhereditary risk factors can yet be incorporated into the model (Table 1).
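The Bayesian reasoning behind this kind of model can be illustrated with a single update for one individual. The sketch below is a toy example and not the BRCAPRO algorithm, which evaluates the whole pedigree jointly; the function names and every numerical input are hypothetical.

```python
def posterior_carrier_probability(prior, likelihood_if_carrier,
                                  likelihood_if_noncarrier):
    """Bayes' theorem for the probability of carrying a mutation.

    prior: probability of being a carrier before the observation
    likelihood_if_carrier / likelihood_if_noncarrier: probability of the
        observed cancer status given each genotype
    """
    numerator = prior * likelihood_if_carrier
    denominator = numerator + (1 - prior) * likelihood_if_noncarrier
    return numerator / denominator

def blended_risk(carrier_probability, risk_if_carrier, risk_if_noncarrier):
    """Breast cancer risk weighted by the probability of being a carrier."""
    return (carrier_probability * risk_if_carrier
            + (1 - carrier_probability) * risk_if_noncarrier)

# Hypothetical example: a woman with a 50% prior chance of carrying a
# mutation remains unaffected at an age by which 60% of carriers, but
# only 5% of noncarriers, would have developed breast cancer.
posterior = posterior_carrier_probability(
    prior=0.5, likelihood_if_carrier=0.40, likelihood_if_noncarrier=0.95)

# Hypothetical remaining risks for carriers and noncarriers
remaining_risk = blended_risk(posterior, risk_if_carrier=0.30,
                              risk_if_noncarrier=0.05)
```

Remaining unaffected lowers the carrier probability below the prior, which is why information on unaffected relatives matters. The second function mirrors the idea of deriving a breast cancer risk from the mutation likelihood alone.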

The major drawback from the breast cancer risk-assessment aspect is that no other 'genetic' element is allowed for. As such, this model will underestimate risk in breast-cancer-only families. The BRCAPRO model produced the least accurate breast cancer risk estimation from our family history clinic validation [27]. The model predicted only 49% of the breast cancers that actually occurred in the screened group of 1,900 women.

Cuzick–Tyrer model

Until recently, no single model integrated family history, surrogate measures of endogenous oestrogen exposure and benign breast disease in a comprehensive fashion. The Cuzick–Tyrer model, based partly on a dataset acquired from the International Breast Intervention Study and other epidemiological data, has now done this. The major advantage over the Claus model and the BRCAPRO model is that the Cuzick–Tyrer model allows for the presence of multiple genes of differing penetrance. It does produce a readout of BRCA1/2, but also allows for a lower penetrance of BRCAX. As can be seen in Table 1, the Cuzick–Tyrer model addresses many of the pitfalls of the previous models; most significantly, it combines extensive family history, endogenous oestrogen exposure and benign breast disease (atypical hyperplasia). In our validation process, the Cuzick–Tyrer model performed by far the best at breast cancer risk estimation [27].

Model validation

The goodness of fit and discriminatory accuracy of the above four models were assessed using data from 1,933 women attending the Family History Evaluation and Screening Programme in Manchester, UK, of whom 52 developed cancer. All models were applied to these women over a mean follow-up of 5.27 years to estimate the risk of breast cancer. The ratios of expected to observed numbers of breast cancers (95% confidence interval) were 0.48 (0.37–0.64) for the Gail model, 0.56 (0.43–0.75) for the Claus model, 0.49 (0.37–0.65) for the BRCAPRO model and 0.81 (0.62–1.08) for the Cuzick–Tyrer model (Table 1). The accuracy of the models for individual cases was evaluated using receiver-operating characteristic curves. These showed that the area under the curve was 0.735 for the Gail model, 0.716 for the Claus model, 0.737 for the BRCAPRO model and 0.762 for the Cuzick–Tyrer model.
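The two validation measures used here, the expected-to-observed ratio and the area under the receiver-operating characteristic curve, can be sketched as follows. The confidence interval uses a common log-scale Poisson approximation, which is an assumption on our part and not necessarily the method used in the validation study, so it gives limits close to but not identical with those quoted. The function names and the small score lists are illustrative only.

```python
import math

def expected_observed_ratio(expected, observed, z=1.96):
    """E/O ratio with an approximate 95% interval (log-scale Poisson)."""
    ratio = expected / observed
    half_width = z / math.sqrt(observed)
    return ratio, ratio * math.exp(-half_width), ratio * math.exp(half_width)

def area_under_curve(case_scores, control_scores):
    """Probability that a random case is scored above a random control.

    Ties count as one half; this equals the area under the ROC curve.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# 52 observed cancers and an E/O ratio of 0.81 imply about 42 expected
observed = 52
expected = 0.81 * observed
ratio, lower, upper = expected_observed_ratio(expected, observed)

# Hypothetical predicted risks for women who did and did not develop cancer
auc = area_under_curve([0.30, 0.12, 0.08], [0.05, 0.12, 0.03, 0.20])
```

A ratio below 1 with an upper limit below 1 indicates significant underestimation, whereas an interval spanning 1 is compatible with good calibration. An area under the curve of 0.5 corresponds to no discrimination between cases and non-cases.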

The Cuzick–Tyrer model was the most consistently accurate model for prediction of breast cancer. The Gail, Claus and BRCAPRO models all significantly underestimated risk. With a manual approach, however, the accuracy of the Claus tables may be improved by adjusting the lifetime risk for endocrine risk factors (the 'manual method'); for example, a lifetime risk may change from one in five to one in four with late age of first pregnancy. The Gail, Claus and BRCAPRO models all underestimated risk, particularly in women with a single first-degree relative affected with breast cancer. The Cuzick–Tyrer model and the manual model were both accurate in this subgroup. Conversely, all the models accurately predicted risk in women with multiple relatives affected by breast cancer (that is, two first-degree relatives and one first-degree relative plus two other relatives). This implies that the effect of a single affected first-degree relative is higher than may have been previously thought. The Gail model is likely to have underestimated risk in this group as it does not take into account the age at onset of breast cancer, and most women in our single first-degree relative category had a relative diagnosed at younger than 40 years of age. The BRCAPRO, Cuzick–Tyrer and manual models were the only models to accurately predict risk in women with a family history of ovarian cancer. As these were the only models to take account of ovarian cancer in their risk-assessment algorithm, this confirmed that ovarian cancer has a significant effect on breast cancer risk.

The Gail, Claus and BRCAPRO models all significantly underestimated risk in women who were nulliparous or whose first live birth occurred after the age of 30 years. Moreover, the Gail model appeared to increase risk with pregnancy at age < 30 years in the familial setting. It is not clear why such a modification to the effects of age at first birth should be made, unless it is as a result of modifications to the model made after early results suggested an increase with BRCA1/2 mutation carriers [21]. Whatever its origin, the apparent increase in risk with early first pregnancy in the Gail model appears to be misplaced, judging from our results and from subsequent studies published on BRCA1/2 [22, 23]. Furthermore, the Gail, Claus and BRCAPRO models also underestimated risk in women whose menarche occurred after the age of 12. The Cuzick–Tyrer and manual models accurately predicted risk in these subgroups. These results suggest that age at first live birth also has an important effect on breast cancer risk, while age at menarche perhaps has a lesser effect. The effect of pregnancy at age < 30 years appeared to reduce risk by 40–50% compared with an older first pregnancy or nulliparity, whereas at the extremes of menarche there was only a 12–14% effect.

Our study remains the only one to validate a risk model prospectively, and clearly further such studies are necessary to gauge the accuracy of these and newer models. Indeed, the tendency to modify models to adapt for new risk factors without prospective revalidation in an independent dataset is a problem and can lead to erroneous risk prediction.

BOADICEA model

Using segregation analysis, a group in Cambridge, UK has derived a susceptibility model, the Breast and Ovarian Analysis of Disease Incidence and Carrier Estimation Algorithm (BOADICEA) model, in which susceptibility is explained by mutations in BRCA1 and BRCA2 together with a polygenic component reflecting the joint multiplicative effect of multiple genes of small effect on breast cancer risk. The group has shown that the overall familial risks of breast cancer predicted by the model are close to those observed in epidemiological studies. The predicted prevalences of BRCA1 and BRCA2 mutations among unselected cases of breast and ovarian cancer were also consistent with observations from population-based studies. They also showed that their predictions were closer to the observed values than those obtained using the Claus model and the BRCAPRO model. The predicted mutation probabilities and cancer risks in individuals with a family history can now be derived from this model. Early validation studies have been carried out on mutation probability but not yet on cancer risk prediction.

BRCA1/2 risk estimation

A number of models/scoring systems have been derived to assess the probability of a BRCA1 mutation or a BRCA2 mutation in a given individual dependent on their family history. Some of the earlier models, such as the Couch model [42] and the Shattuck-Eidens model [43], were derived before widespread genetic testing had been performed. Two tabular scoring systems have been derived from the Myriad laboratories genetic testing programme [44, 45], with the second based on testing in over 10,000 individuals [45]. The most widely used and validated model is the BRCAPRO model [46–48], which requires computer entry of the family history information. Simpler scoring systems have been developed, such as the Manchester system [49]. While simple tabular or scoring systems are easy to use and can generate probabilities in 1–2 minutes, computer-based programmes take 10–20 minutes to input. Nonetheless, these may well be carried out in clinics in order to generate pedigrees and store family information. Model-based approaches are also able to take into account unaffected relatives.

Validation studies for the BRCA1/2 risk estimation models are much more widespread than for breast cancer risk over time [46–55]. Perhaps the most useful aspect of these is the development of a cutoff point for the intervention of a genetic test at the 10% or 20% level. An assessment of using a cutoff point for several of the models is presented in Table 2. In practice, most genetic testing has been carried out on high-risk families. While a pretesting assessment of the chances of BRCA1/2 involvement is useful, the difference between a 20% chance and a 60% chance of a mutation does not alter the decision of whether or not to test a family member.

Table 2 Sensitivity and specificity at the 10% cutoff point for prediction of identifying a BRCA1/2 mutation in non-Ashkenazi Jewish families for four models/scoring systems based on four validation studies

With genetic testing for BRCA1/2 costing around $3,000, insurance companies and healthcare systems require a threshold for test use. In the United Kingdom this is set at 20% mutation probability [56], but in most of the rest of Europe and North America this is 10%. In order to adequately assess the models at a 10% threshold, a range of families is necessary to test around the threshold. Ideally around 10% of the samples should be mutation positive. As can be seen from Table 2, apart from our own study [50] all the remaining studies had detection rates above 20% [53–55]. None of the models, tables or scoring systems work perfectly and they need to be adjusted for new information such as the triple-negative grade 3 breast cancer histology associated with BRCA1. For a simple first-line test in the clinic, using a scoring system [50] or using a table [45] will at least give a guide as to whether a family will qualify for testing. With improvements in the computer models such as the BOADICEA model, we will hopefully achieve a more accurate and discriminatory cutoff point.
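How a testing threshold translates into sensitivity and specificity can be sketched with a short calculation. The function name and the five families below are entirely hypothetical and serve only to show the mechanics of comparing a 10% and a 20% cutoff; they are not data from any of the validation studies.

```python
def sensitivity_specificity(predicted_probabilities, mutation_found, cutoff):
    """Sensitivity and specificity of offering a test at or above a cutoff.

    Assumes at least one mutation-positive and one mutation-negative family.
    """
    true_pos = false_neg = true_neg = false_pos = 0
    for probability, has_mutation in zip(predicted_probabilities,
                                         mutation_found):
        offered_test = probability >= cutoff
        if has_mutation and offered_test:
            true_pos += 1
        elif has_mutation:
            false_neg += 1
        elif offered_test:
            false_pos += 1
        else:
            true_neg += 1
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical predicted mutation probabilities and test results
probabilities = [0.05, 0.15, 0.30, 0.08, 0.50]
mutations = [False, False, True, True, True]

at_10_percent = sensitivity_specificity(probabilities, mutations, cutoff=0.10)
at_20_percent = sensitivity_specificity(probabilities, mutations, cutoff=0.20)
```

Raising the cutoff can only maintain or reduce sensitivity while maintaining or improving specificity, which is the trade-off a healthcare system accepts when it sets the threshold at 20% rather than 10%.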

Conclusion

There are a number of models available to assess both breast cancer risk and the chances of identifying a BRCA1/2 mutation. Some models perform both tasks, but none are yet totally discriminatory as to which family has a mutation and who will develop breast cancer. Improvements in the models are being made, but these require revalidation processes. The discovery of all the alleles associated with breast cancer risk will add a new layer of complexity to all these models.