Introduction

Assessments of detrimental health risks due to exposures from ionizing radiation are often based on studies on survivors of the World War II atomic bombings over Hiroshima and Nagasaki. The life span study (LSS) of these A-bomb survivors continues to provide valuable radiation epidemiological data and quantitative assessments of the radiation-related solid cancer, site-specific and leukemia risks (Preston et al. 2003, 2004, 2007; Ozasa et al. 2012). Results from this cohort have formed a basis for the construction of radiation protection guidelines that include the setting of various dose limits and reference values to the radiation received by occupationally exposed workers and the general public. Such limits and values have come from assessments and recommendations that are issued and updated at regular intervals by national and international bodies, that is, the International Commission on Radiological Protection (ICRP), the United Nations Scientific Committee on the Effects of Atomic Radiation (UNSCEAR), Biological Effects of Ionizing Radiation, U.S.A. (BEIR) and the Environmental Protection Agency, U.S.A. (EPA).

Choices for risk models applied in reports produced by such bodies (UNSCEAR, BEIR VII, EPA and ICRP) are often mainly based upon the Japanese atomic bomb survivor data. Although the specific choice of risk models for risk versus dose–responses and the lifetime risk calculations differ between regulatory groups, there is also disparity in applications of the weights to be applied to excess relative risk (ERR) or excess absolute risk (EAR) models when used in the calculation of lifetime risks. These weights were, up to now, invariably chosen in a qualitative way based on expert opinion. Owing to this process of decision making based on expert judgement, the numerical values of weights to be applied to ERR and EAR models when used in the calculation of lifetime risks vary both between reports produced at different points in time by the same body and also between reports produced at similar points in time by different bodies.

The purpose of this paper is to propose and illustrate a quantitative method for obtaining such weights. It is suggested that this method could be a useful addition and an aid to the qualitative decisions made by experts (which have often been quantified based only on qualitative judgement in the past) by providing additional quantitative results, such that decisions can also be based on measures of evidence specifically linked to the choice of ERR and EAR models. Although this technique is applicable to all measures of lifetime risk, the simplest measure called lifetime attributable risk (LAR) is considered here to illustrate the proposed weighting technique.

Materials and methods

Lifetime risks and the relative weighting of ERR and EAR models used in the calculations

Lifetime attributable risk (LAR; see e.g., Kellerer et al. 2001, 2002 and Vaeth and Pierce 1990) has a linear dose–response for a linear ERR or linear EAR model. Separate evaluations of LAR can be made using an EAR model, an ERR model or a mixture of the two. For a person exposed at age e to a radiation dose D, the LAR is

$$ {\text{LAR}}(D,e) = \int\limits_{e + L}^{{a_{\hbox{max} } }} {M(D,e,a)S(a)/S(e)\;{\text{d}}a} $$
(1)

where M(D, e, a) is the EAR at an attained age, a, after an exposure at age e. S(a) is the survival curve, that is, the probability of surviving to age a, and L is the minimum latency period (e.g., 2 years for leukemia, 5 years for solid cancers). The ratio S(a)/S(e) is the conditional probability that a person alive at age e reaches at least age a. (In Eq. (1) there is an assumed dependence of population statistics on gender.) The LAR approximates the probability of a premature incidence of primary cancer from radiation exposure and is a weighted sum (over attained ages up to \( a_{\max} \)) of the age-specific excess probabilities of radiation-induced cancer incidence. M(D, e, a) can be defined in three alternative ways:

$$ M\left( {D, \, e, \, a} \right) = {\text{EAR }}\left( {D, \, e, \, a} \right); $$
(2a)
$$ M\left( {D, \, e, \, a} \right) = {\text{ERR }}\left( {D, \, e, \, a} \right) \cdot m\left( a \right); $$
(2b)

or a weighted arithmetic sum of both:

$$ M\left( {D, \, e, \, a} \right) = w_{1} {\text{EAR}}\;\left( {D, \, e, \, a} \right) + \left( {1 - w_{1} } \right)\left( {{\text{ERR }}\left( {D, \, e, \, a} \right) \cdot m\left( a \right)} \right). $$
(2c)

where m(a) is the spontaneous cancer incidence rate in a pre-defined population or subpopulation at risk, \( w_{1} \) is a weighting between 0 and 1, and S(a) is the survival function for the unexposed population. This makes the LAR differ from the quantity REID (risk of exposure-induced death) as used by UNSCEAR because REID has an exposure-dependent survival function. However, LAR and REID coincide at lower doses under about 0.5 Gy (see Fig. 1 of Kellerer et al. 2001).
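As an illustration of Eqs. (1) and (2c), the following minimal sketch evaluates the LAR by numerical integration. All functional forms and numbers in it (the survival curve, the baseline rate and the EAR and ERR models) are hypothetical placeholders chosen only to make the example run; they are not the models or population statistics used in this paper.

```python
import math

def survival(a):
    # Hypothetical survival curve S(a); a placeholder, not population data.
    return math.exp(-((a / 85.0) ** 6))

def baseline_rate(a):
    # Hypothetical spontaneous incidence rate m(a) per person-year.
    return 1e-3 * (a / 70.0) ** 4

def ear(dose, e, a):
    # Hypothetical linear EAR model with age-at-exposure and attained-age modifiers.
    return 1e-3 * dose * math.exp(-0.03 * (e - 30.0)) * (a / 70.0) ** 2

def err(dose, e, a):
    # Hypothetical linear ERR model with age-at-exposure and attained-age modifiers.
    return 0.5 * dose * math.exp(-0.02 * (e - 30.0)) * (a / 70.0) ** -1.5

def excess_rate(dose, e, a, w1):
    # Eq. (2c): weighted arithmetic sum of the EAR and ERR * m(a) contributions.
    return w1 * ear(dose, e, a) + (1.0 - w1) * err(dose, e, a) * baseline_rate(a)

def lar(dose, e, w1, latency=5.0, a_max=100.0, steps=2000):
    # Eq. (1): trapezoidal integration from e + L to a_max of M * S(a) / S(e).
    lower = e + latency
    if lower >= a_max:
        return 0.0
    h = (a_max - lower) / steps
    total = 0.0
    for i in range(steps + 1):
        a = lower + i * h
        value = excess_rate(dose, e, a, w1) * survival(a) / survival(e)
        total += value * (0.5 if i in (0, steps) else 1.0)
    return total * h
```

Because Eq. (2c) is a weighted arithmetic sum, the mixed LAR from this sketch equals the same weighted sum of the pure EAR-based (w1 = 1) and pure ERR-based (w1 = 0) values.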

Model weighting from multi-model inference statistics

An important part of the inferential process in radiation epidemiology and other fields is assessing the preferred model for a dataset by selection involving Akaike’s information criterion (AIC) (Akaike 1973, 1974; Walsh 2007). The introduction of AIC for model selection has been a positive contribution to the field of radiation epidemiology. Akaike’s information criterion is defined by AIC = −2log(MaxLikelihood) + 2k, where k is the number of parameters in the model and the first term on the right-hand side is just the deviance. Models with smaller values of AIC are favored on the basis of fit and parsimony. Model uncertainty contributes a large fraction of the total uncertainty for cancer risk evaluations associated with radiation exposure. Assessing this source of uncertainty in a qualitative way or even neglecting this source of uncertainty altogether is a serious shortcoming of previous evaluations.

Recent statistical literature covers various approaches for combining information from different models fitted to the same data via multi-model inference (MMI) (Hoeting et al. 1999; Posada and Buckley 2004; Chatfield 1995; Claeskens and Hjort 2008; Burnham and Anderson 1998, 2002, 2004). To average over candidate models, a weight is assigned to each model and then measures of interest are inferred across all weighted models. The Akaike weights for model averaging have already been applied in radiation epidemiology to provide an objective basis for model selection and MMI (e.g., Walsh and Kaiser 2011). Within this framework, the Akaike weights (\( w_{i} \), i = 1, 2, …, m) are computed for each model,

$$ w_{i} = \exp [ - 0.5({\text{AIC}}_{i} - \hbox{min} {\text{AIC}})]/\sum\limits_{j = 1}^{m} {\exp [ - 0.5({\text{AIC}}{}_{j} - \hbox{min} {\text{AIC}})],} $$
(3)

where m is the number of models and min AIC is the smallest AIC value among all models considered. The probability that the ith model is the best among the m models considered is quantified by the individual \( w_{i} \) values.
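A minimal sketch of Eq. (3) is given below. The function names are illustrative assumptions rather than part of any published software; the only inputs needed are the deviance and the number of fit parameters of each model, from which the AIC follows as defined above.

```python
import math

def aic(deviance, k):
    # AIC = deviance + 2k, with deviance = -2 log(maximum likelihood).
    return deviance + 2 * k

def akaike_weights(aic_values):
    # Eq. (3): relative likelihoods exp(-0.5 * (AIC_i - min AIC)), normalized to sum to 1.
    min_aic = min(aic_values)
    relative = [math.exp(-0.5 * (value - min_aic)) for value in aic_values]
    total = sum(relative)
    return [r / total for r in relative]
```

Subtracting min AIC before exponentiating does not change the weights; it only keeps the exponentials in a numerically safe range when the deviances are large.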

A model-averaged estimator of a quantity, \( \hat{\mu } \), can then be obtained as a weighted average of estimators, \( \hat{\mu }_{i} \), from model i (i.e., in the practical terms of the analysis presented here, either the right-hand sides of Eq. (2a) or (2b))

$$ \hat{\mu } = \sum\limits_{{i{ = 1}}}^{m} {w_{i} \hat{\mu }_{i} } . $$
(4)

Consequently, the weights \( w_{1} \) and \( 1 - w_{1} \) in Eq. (2c) can be obtained from \( w_{1} \) and \( w_{2} \) in a two-model (m = 2, one EAR and one ERR) MMI procedure.
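For the two-model case, Eq. (3) reduces to a closed form because min AIC cancels from the numerator and denominator. Writing \( {\text{AIC}}_{\text{EAR}} \) and \( {\text{AIC}}_{\text{ERR}} \) for the two AIC values (a notation introduced here only for illustration), the EAR weight is

$$ w_{1} = \frac{\exp ( - 0.5\,{\text{AIC}}_{\text{EAR}} )}{\exp ( - 0.5\,{\text{AIC}}_{\text{EAR}} ) + \exp ( - 0.5\,{\text{AIC}}_{\text{ERR}} )} = \frac{1}{1 + \exp [0.5({\text{AIC}}_{\text{EAR}} - {\text{AIC}}_{\text{ERR}} )]}, $$

so that only the AIC difference between the two models matters, and equal AIC values give \( w_{1} = 0.5 \).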

EAR and ERR risk models for cancer incidence of various sites and leukemia mortality

Recent site-specific and all solid cancer incidence models for the follow-up 1958–1998 (Preston et al. 2007) (data file: lssinc07.csv, results files: lss07solmod.log, lss07siteahs.log, lss07sitemod.log from www.rerf.or.jp) and preferred leukemia mortality models for the follow-up 1950–2003 selected from a comprehensive MMI study (Kaiser and Walsh 2012) have been selected here for illustration. These results files contain information on model fit parameters and goodness of fit measures obtained from optimizing the models to the data via Poisson regression. The results files were generated with the AMFIT module of the EPICURE software (Preston et al. 1993) and were made available on the internet by Preston and co-workers, after the publication of Preston et al. (2007).

Since the forms of the EAR and ERR models, chosen to illustrate the weighting technique, have been described in Preston et al. 2007 (for all solid cancer and site-specific cancer incidence) and Kaiser and Walsh (2012) (for leukemia mortality), they will not be given explicitly here. The all solid cancer ERR and EAR models (linear in neutron weighted colon dose) and seven site-specific models (linear in neutron weighted organ dose) are fully parametric and all have exponential effect modifiers for age at exposure and effect modification for age attained via power functions even if they are not statistically significant. The preferred leukemia models from Kaiser and Walsh (2012) apply a dose–response with both a quadratic and an exponential dose term, modified by a power function of attained age. The exponential term has the effect of damping the quadratic dose–response to a small extent at higher doses. All solid cancer models documented in the RERF results files are gender-averaged models (except in the cases of sex-specific sites, that is, female breast cancer).

The nine sites considered here to illustrate the weighting technique are—all solid cancer, female breast, lung, colon, stomach, liver, bladder, thyroid cancer and leukemia. The solid cancer sites have both ERR and EAR models available in the results files. Other sites (oral cavity, esophagus, rectum, gall bladder, pancreas, renal cell, central nervous system, uterus, ovary and prostate) only have ERR models documented in the results files, that are linear in organ dose, but with no age-related effect modifiers. In fact, all that is required to apply this new application of the MMI technique is the deviance of the model fit and the number of fit parameters in the model. This information has been extracted from the original results files and is documented in Table 1.

Table 1 Quantification of the relative ERR and EAR weights to apply in the calculation of lifetime risks

However, many of the fit parameters given in the original RERF results files are not statistically significant. A justification for this general RERF approach was given in Pierce et al. (1996) where it was explained that as follow-up continues, risk modifications such as a sex effect will increase in statistical significance and so fit parameters that represent such risk modifications are included even if they are not statistically significant at the time of analysis.

Since small differences in model specifications can lead to differences in LAR projection results, it is more appropriate to consider here the optimized models. In the original paper by Preston et al. (2007), the models for all solid cancer seem to have been applied for the site-specific models considered here with little attention to model selection techniques so that approximately one-third of the original parameters were associated with large p-values much greater than 0.1. Consequently, the fit parameters that corresponded to p > 0.1 in the original Preston et al. 2007 models were removed, and the models re-optimized before the model weights were calculated. This means that both the baseline risks and the age and gender effect modifications in the excess risk part of the model were more appropriately represented and differences in the statistical significance of age or gender effect modifications between EAR and ERR models were taken into account with model selection techniques.

Results

The results of applying this technique to the sites considered here using the original models of Preston et al. (2007), from the computer files on the RERF website, are shown in Table 1. It can be seen, from the last column of Table 1, that the relative EAR weighting, \( w_{1} \), by cancer incidence site is zero for breast, 0.11 for colon, 0.12 for all solid and thyroid, 0.19 for lung, 0.22 for bladder, 0.25 for liver and 0.79 for stomach.

Table 2 shows the results of applying this technique to the sites considered here with the optimized models where the fit parameters that corresponded to p > 0.1 in the original Preston et al. (2007) models were removed and the models re-optimized. The weights obtained with the choice of p > 0.1 are very similar to those obtained if p > 0.05 is used. It can be seen, from the last column of Table 2, that the relative EAR weighting, \( w_{1} \), by cancer incidence site, is zero for breast and colon, 0.02 for all solid, 0.03 for lung, 0.08 for liver, 0.15 for thyroid, 0.18 for bladder and 0.93 for stomach. The latter results are preferred here because they have been obtained from models in which differences in the statistical significance of age or gender effect modifications between EAR and ERR models were fully and correctly accounted for by standard model selection techniques. For example, in the original liver cancer models of Preston et al. (2007), none of the age and gender excess risk modifiers were statistically significant in the ERR model (the gender and age at exposure effect modifiers both had p > 0.5 and the age-attained effect modifier had p = 0.21) but the age attained (p = 0.008) and gender (p = 0.04) risk modifiers were statistically significant in the EAR model.

Table 2 Quantification of the relative ERR and EAR weights to apply in the calculation of lifetime risks

There is one important refinement to the breast cancer models described on page 34 of Preston et al. (2007), but not explicitly included in the computer files on the RERF website, that should also be considered here. That is explicitly allowing, in the EAR model, for the generally observed change in the trend between female age-specific breast cancer incidence rates and attained age, associated with menopause that is sometimes called Clemmesen’s hook (Clemmesen 1948). Modeling Clemmesen’s hook is just done by allowing effect modification of the EAR by the logarithm of age in the functional form of a quadratic spline with a knot at age 50 years. If Clemmesen’s hook is explicitly accounted for, the EAR weighting increases from 0 to 0.3.

Table 3 compares the central risk estimates obtained with the original solid cancer models of Preston et al. (2007) with the central risk estimates obtained from the optimized models considered here. The central estimates are very similar for all sites, indicating that the further model optimization undertaken by the current authors had only a minor impact on the central risk estimates.

Table 3 Central risk estimates with standard Wald-type errors, for the models applied in Table 1 (from Preston et al. 2007) and Table 2 (the models from Preston et al. 2007 but re-optimized with the fit parameters with p-values > 0.1 removed)

The leukemia model of Preston et al. (2004), which was originally published only as an EAR model, had a deviance of 2254.8 (value quoted from www.rerf.or.jp, filename: ds02can.log). The analogous ERR model has only been presented recently (Walsh and Kaiser 2011) and had a deviance of 2258.7 with the same number and type of fit parameters. Applying this technique to these models for leukemia mortality results in an EAR weighting of 0.88 (Table 1). However, the MMI procedures of Walsh and Kaiser (2011) and Kaiser and Walsh (2012) identified leukemia models with much higher model weights than the leukemia model of Preston et al. (2004). Application of the preferred models from Kaiser and Walsh (2012) with the follow-up data covering the period 1950–2003, as applied in the analysis by Ozasa et al. (2012), results in an EAR weighting of 0.
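The EAR weighting of 0.88 for the Preston et al. (2004) type leukemia models can be reproduced directly from the two quoted deviances. Since both models have the same number of fit parameters, the 2k terms cancel and the AIC difference equals the deviance difference:

$$ \Delta {\text{AIC}} = 2258.7 - 2254.8 = 3.9,\quad w_{1} = \frac{1}{1 + \exp ( - 0.5 \times 3.9)} \approx \frac{1}{1 + 0.142} \approx 0.88. $$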

From these results, it can be seen that, for the sites considered here, lifetime risk transfer is most highly weighted by EAR only for stomach cancer.

Discussion

In radiation-related cancer risk assessment for a subpopulation at risk, one is often required to transfer the risk obtained from the LSS of atomic bomb survivors to the actual subpopulation at risk. Due to a process of decision making based on expert judgement, the numerical values of weights to apply to ERR and EAR models, when used in the calculation of lifetime cancer risks, vary both between reports produced at different times by the same body and also between reports produced at similar times by different bodies.

In transporting risk estimates from Japan to the U.S.A., BEIR V (1990) assumed a multiplicative model. In contrast to this, BEIR VII/phase 2 (2006) applied a weight, \( w_{B7} \), of 0.7 for the estimate obtained using ERR transport for sites other than breast, thyroid, and lung, and a complementary weight of 0.3 for the estimate obtained using EAR transport. This choice was justified in BEIR VII/phase 2 (Chapter 10), by acknowledging that there is somewhat greater support for relative risk than for absolute risk transport. However, the BEIR VII/phase 2 (2006) weighting was done on a logarithmic scale. The LAR values were calculated separately based on preferred EAR and ERR models and then combined using a weighted geometric mean, whereby \( {\text{LAR}}_{B7} = {\text{LAR}}_{\text{ERR}}^{w_{B7}} \cdot {\text{LAR}}_{\text{EAR}}^{(1 - w_{B7})} \). The BEIR VII report acknowledges that the choice of \( w_{B7} \) values “clearly involves subjective judgment”. This geometric mean (GM) approach is not consistent with Eq. (4) and the current literature on MMI (some of which is cited in the materials and methods section of this paper).

The EPA (1994) report also adopted a GM approach stating that this “reflects a judgment regarding the distribution of uncertainty associated with the transportation of risk”. However, the later EPA (2011) report (see also Pawel and Puskin 2012) made two points against the weighted GM approach stating, “First, it is difficult to explain how a projection based on the GM should be interpreted. Second, the GM is not additive in the sense that: the GM of two risk projections for the combined effect of separate exposures is generally not equal to the sum of the GM projections for the exposures.” For these reasons, EPA (2011) employed a weighted arithmetic mean to combine ERR and EAR projections, but still applied the numerical ERR weighting value of 0.7.
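The non-additivity of the weighted GM can be illustrated numerically. In the sketch below, the four risk projections are hypothetical values chosen only for illustration (they are not results from any report), and the ERR weight of 0.7 is the value quoted above.

```python
def weighted_gm(lar_err, lar_ear, w_err):
    # Weighted geometric mean of ERR- and EAR-based projections.
    return lar_err ** w_err * lar_ear ** (1.0 - w_err)

def weighted_am(lar_err, lar_ear, w_err):
    # Weighted arithmetic mean of ERR- and EAR-based projections.
    return w_err * lar_err + (1.0 - w_err) * lar_ear

W_ERR = 0.7

# Hypothetical (ERR-based, EAR-based) projections for two separate exposures.
exposure_1 = (1.0, 4.0)
exposure_2 = (3.0, 1.0)
# Projections for the combined effect of both exposures.
combined = (exposure_1[0] + exposure_2[0], exposure_1[1] + exposure_2[1])

gm_of_sum = weighted_gm(*combined, W_ERR)
sum_of_gm = weighted_gm(*exposure_1, W_ERR) + weighted_gm(*exposure_2, W_ERR)

am_of_sum = weighted_am(*combined, W_ERR)
sum_of_am = weighted_am(*exposure_1, W_ERR) + weighted_am(*exposure_2, W_ERR)

print(f"GM: combined = {gm_of_sum:.3f}, sum of parts = {sum_of_gm:.3f}")
print(f"AM: combined = {am_of_sum:.3f}, sum of parts = {sum_of_am:.3f}")
```

With these illustrative inputs the two GM values differ, whereas the two arithmetic-mean values agree. The GM is additive only in special cases, for example when the ratio of the ERR-based to the EAR-based projection is the same for both exposures.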

ICRP 103 (2007) projections were based on a weighted arithmetic average of ERR and EAR risk model projections. For most sites, ICRP 103 used a “subjective probability” weight (\( w_{\text{ICRP}} \)) of 0.5 for the ERR model; exceptions include breast, bone, and leukemia cancers (\( w_{\text{ICRP}} = 0 \)), thyroid cancer (\( w_{\text{ICRP}} = 1 \)) and lung cancer (\( w_{\text{ICRP}} = 0.3 \)).

UNSCEAR (2006) projections (see also Little et al. 2008) were done separately for EAR and ERR transport. Methods for combining site-specific ERR and EAR risk projections for the risk transport problem were not recommended.

Excess relative risk models, often referred to as multiplicative, are appropriate if radiation risks are proportional to baseline rates, and EAR models (additive) are an alternative if radiation risks add to baseline rates. Since the gender and age patterns are often markedly different between EAR and ERR models, the Akaike model weights can help to determine the relative goodness of fit of the two types of models to the data. It can be seen from the results section that lifetime risk transfer is most highly weighted by EAR only for stomach cancer. The resulting Akaike model weights could be applied directly, if the projections are done from the LSS to a population at risk with similar genetic and environmental characteristics and similar baseline risks (e.g., the Japanese subpopulation at risk after the Fukushima nuclear power plant radiation release in 2011).

However, such projections are also required when large differences are observed in comparisons of baseline risks between the subpopulation at risk and the LSS for some site-specific cancers. For example, baseline risks for cancers of the colon, lung and female breast are higher in the USA than in the LSS, whereas according to EPA (2011), ICRP 103 (2007) and UNSCEAR (2006), baseline risks for cancers of the stomach and liver are much higher in the LSS than in the USA.

In these cases, estimates based on relative and absolute risk can differ substantially and additional considerations are necessary. These include (a) comparisons of radiation risks from epidemiological studies on non-Japanese populations with the LSS results, (b) evaluations of the interaction of radiation and other factors that contribute to differences in baseline rates and (c) considerations of biological mechanisms of carcinogenesis. If inter-population and non-Japanese and interaction studies are sparse for particular cancer sites with large differences in comparisons of baseline risks between the subpopulation at risk and the LSS, then just applying the LSS Akaike weights may lead to misleading results and caution is required.

For example, the UNSCEAR (2006) stomach cancer estimates for the population of the USA based on absolute risk transport are approximately an order of magnitude larger than those based on relative risk transport (and inter-population and interaction studies are sparse for stomach cancer). In this extreme case, a direct additive transfer mode of the LSS EAR (\( {\text{EAR}}_{\text{LSS}} \)) to the much lower USA baseline rates would lead to a proportionately very high radiation risk. It is currently not known if the \( {\text{EAR}}_{\text{LSS}} \) for stomach cancer should then be (1) applied directly to the USA baseline rates (\( {\text{BL}}_{\text{USA}} \)), (2) scaled first and then applied, that is, applied as \( {\text{EAR}}_{\text{LSS}} \cdot ({\text{BL}}_{\text{USA}} /{\text{BL}}_{\text{LSS}} ) \), or (3) applied as an ERR that has been calculated using the parameters from the preferred LSS EAR model. Due to lack of knowledge on the interactions of radiation and other factors that contribute to differences in baseline rates, especially in the case of stomach cancer, there is no evidence-based reason for preferring transfer mode 1, 2 or 3.

Weights from one radio-epidemiology study may differ from those obtained in a different study for the same cancer site. Some examples of this for breast and thyroid cancer are given later in this section. Another consideration, as pointed out in the BEIR VII/phase 2 (2006) report, is that ERR models used to obtain relative risk transport estimates may be less vulnerable to possible bias from under-ascertainment of cases. A further consideration is that one in six cancers is caused by infections (de Martel et al. 2012), for example, Helicobacter pylori for stomach cancer and hepatitis B and C viruses for liver cancer. Consequently, a finding that the ERR model describes the variation of radiation-associated risks in the LSS with age at exposure and attained age better or worse than the EAR model may only partially explain how radiation-associated liver or stomach cancer risks compare between the LSS and the subpopulation at risk (i.e., between Japan and, for example, the USA). A high prevalence of infection with hepatitis may act as a confounding factor in the LSS (UNSCEAR 2006). In such cases, genetic or environmental risk factors need additional assessment prior to the calculation of LAR. Usually, the computation of LAR is done either for a typical exposed individual or for an exposed population, under a set of assumptions concerning genetic or environmental risk factors. That is, it is assumed that the typical exposed individual is also typical with respect to genetic susceptibility, or that the pattern of baseline cancers in an exposed population is not atypical through a population susceptibility to a certain type of cancer (e.g., adult T cell leukemia is endemic in Nagasaki; Arisawa et al. 2002). Preliminary work on testing such assumptions concerning genetic or environmental risk factors should be done, based on expert opinion, prior to the LAR calculations.

Clearly, even after consideration of such points, unless one adopts the UNSCEAR (2006) approach, there is a need for a method that is based on the current statistical literature on MMI and quantitative weighting founded on evidence-based techniques, rather than subjective probabilities. The suggested application of the technique presented here aims to help fulfill this need, and it is recommended, with the cautions already explained, as an additional aid to expert opinion for future health risk assessments.

The next few paragraphs consider some of the model-based evidence obtained from the literature for leukemia, breast, thyroid and lung cancer in order to see if the LSS results from the approach presented here can be validated.

Model-based evidence obtained from the literature for leukemia

Little et al. (1999) found that relative risk models which account for leukemia sub-type provide a reasonable fit to data on A-bomb survivors, cervical cancer patients and spondylitis patients. Little (2008) presented leukemia risk models for childhood radiation exposure, fitted to the LSS data and data from several medical studies, and found that “a relative risk transfer may be more appropriate than an absolute risk transfer between the Japanese A-bomb survivors and these three childhood populations”. Little et al. (2008) presented two optimal leukemia risk models employed in UNSCEAR (2006). From the goodness of fit parameters in Table 2 of Little et al. (2008), the ERR weights are 0.98 and 0.97 for the two optimal models with the linear-quadratic dose–response and the linear-quadratic-exponential dose–response, respectively. The proposed method produces very similar results using information from all of these papers and leads to the conclusion that, for leukemia, projections should be primarily based on the ERR model (ERR weight = 1.0).

Model-based evidence obtained from the literature for female breast cancer

Preston et al. (2002) derived breast cancer risks in eight different cohorts and, although their findings did not provide a clear answer to the question of how risk projection should be done, recommended that either EAR estimates or a scaling of the attained age ERR model by the ratio of the baseline rates in Japan to those in the population of interest should be used. Preston et al. (2002) also stated that “Formal statistical comparison of the fits of the excess relative risk and absolute excess rate models was not possible. An informal comparison of the deviance values for the various fitted models considered suggested that, while deviances for the ERR models tend to be slightly smaller than those for the EAR models, both types of models provide comparable fits to these data” (page 231, 2nd column, 2nd paragraph from the top). However, there is enough information on the deviance of the final pooled data models of Preston et al. (2002) to compute the relative goodness of fit. One can continue to read further in Preston et al. (2002) that “final ERR model has deviance = 5849.3 and final EAR model has deviance of 5854.7 with two parameters more”. From this information, the change in Akaike Information Criterion (AIC) is 9.4, which gives a calculated evidence ratio in favor of the ERR model over the EAR model of 110, corresponding to a probability of model improvement of 0.991 (see Table 2 of Walsh 2007). In this type of comparison of the fits, an ERR weight of 0.99 can be calculated, which is similar to the results presented here in Table 2 (ERR weight of 1 or 0.7, if Clemmesen’s hook is accounted for). Both sets of weights lead to the conclusion that projections should be primarily based on the ERR model. This conclusion is also supported by a recent MMI analysis of breast cancer incidence in A-bomb survivors by Kaiser et al. (2012), whereby none of the EAR models could be included in the protocol selected set of models considered for MMI, due to the poor quality of their fit to the data relative to ERR and mechanistic type models.
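The arithmetic behind these figures can be retraced with the short sketch below, which uses only the two deviances and the parameter difference quoted from Preston et al. (2002); the variable names are illustrative.

```python
import math

deviance_err = 5849.3      # final pooled ERR model
deviance_ear = 5854.7      # final pooled EAR model
extra_params_ear = 2       # the EAR model has two more fit parameters

# AIC difference: deviance difference plus 2 per additional parameter.
delta_aic = (deviance_ear + 2 * extra_params_ear) - deviance_err

# Evidence ratio in favor of the ERR model and the corresponding ERR weight.
evidence_ratio = math.exp(0.5 * delta_aic)
err_weight = evidence_ratio / (1.0 + evidence_ratio)

print(f"delta AIC = {delta_aic:.1f}")
print(f"evidence ratio = {evidence_ratio:.0f}")
print(f"ERR weight = {err_weight:.3f}")
```

Only the difference in the number of parameters enters the calculation, so the absolute parameter counts of the two models are not needed.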

Little and Boice (1999) provided a detailed comparison of breast cancer risks in the A-bomb survivors to those in the Massachusetts Tuberculosis Fluoroscopy (MTBF) cohort. Applying the goodness of fit parameters from Tables 2 and 3 of Little and Boice (1999), it is possible to calculate weights for the earlier A-bomb incidence dataset with follow-up: 1958–1987 as described in Thompson et al. (1994) (ERR weight = 1) and the MTBF cohort (ERR weight = 0.35). This discrepancy in the ERR weights could be related to different molecular subgroup types of breast cancer, possibly with different sensitivities to radiation, occurring with different frequencies in the USA and Japan—since breast cancer has been shown by Curtis et al. (2012) to be divisible into 10 novel molecular subgroups based on the impact of somatic copy number aberrations. Such discrepancies indicate that the weights for breast cancer risk transfer may need to be chosen more specifically for the population of interest, that is, a 35 % ERR, 65 % EAR transfer may be more appropriate for calculating LAR of breast cancer for a USA or western population and a 100 % (or 70 %, if Clemmesen’s hook is accounted for) ERR transfer may be more appropriate for calculating LAR of breast cancer for a Japanese or Asian population. An alternative evidence-based method, for applying discrepant weighting results from different cohorts with a known similarity of cancer sub-types, could be to apply an outer weighting (by some appropriate measure of study size) to the individual study weights.
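One possible form of such an outer weighting is sketched below. The ERR weights of 1 and 0.35 are the values calculated above for the two cohorts, but the study-size values are hypothetical placeholders, and the choice of size measure (e.g., number of cases) is an assumption left open here.

```python
def pooled_weight(model_weights, study_sizes):
    # Outer weighting: average the per-study model weights,
    # weighting each study by its (assumed) measure of study size.
    if len(model_weights) != len(study_sizes):
        raise ValueError("one study size is needed per model weight")
    total_size = sum(study_sizes)
    return sum(w * s for w, s in zip(model_weights, study_sizes)) / total_size

# ERR weights per study (from the text) and hypothetical study sizes.
err_weights = [1.0, 0.35]       # A-bomb incidence data, MTBF cohort
study_sizes = [100.0, 300.0]    # placeholder values, not real case counts

pooled_err_weight = pooled_weight(err_weights, study_sizes)
pooled_ear_weight = 1.0 - pooled_err_weight
print(f"pooled ERR weight = {pooled_err_weight:.4f}")
```

The pooled weight always lies between the smallest and largest individual study weights, and moves toward the weight of the larger study as the size ratio grows.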

Model-based evidence obtained from the literature for thyroid cancer

A pooled study on thyroid cancer risks from seven individual studies has been presented by Ron et al. (1995). In Table 6 of Ron et al. (1995), there is enough information given to calculate the weights for the earlier LSS incidence data, and three cohort studies on children irradiated for various maladies (see Table 4). Models for two studies (the largest study, based on 309 thyroid cancers in children irradiated for enlarged tonsils in Chicago and a study of 60 thyroid cancers in children treated for Tinea capitis) result in an EAR weighting of 0. However, models for the study on children irradiated for an enlarged thymus gland, based on 38 thyroid cancers, result in an EAR weighting of 0.88.

Table 4 Collection of thyroid cancer studies from which the relative weights for ERR and EAR can be calculated

Consideration of the goodness of fit of models from a study of children affected by the Chernobyl nuclear power plant accident (Table 1 of Jacob et al. 2006 gives parameters for an EAR and an ERR model with almost the same AIC as each other, that is, ΔAIC = 0.007) results in an ERR weighting of 0.5. Another study on children affected by the Chernobyl nuclear power plant accident, Likhtarov et al. (2006), provides the goodness of fit parameters required for the calculation of the weights in Tables 6 and 7 of Likhtarov et al. (2006) for models 1–4 of that study. These four models are for a baseline plus linear dose–response (model 1), whereby the interaction of dose with either gender (model 2), age in 1986 (model 3) or calendar year period (model 4) was also given. Models 1–3 indicate an ERR weight of 1 but model 4 indicated an EAR weight of 1 (but failed to provide converged EAR risk estimates; see Table 4). The general conclusion is that, for thyroid cancer, projections should be based on a mixed model that is most heavily weighted toward the ERR (ERR weight = 0.85, EAR weight = 0.15).

Model-based evidence obtained from the literature for lung cancer

A recent lung cancer analysis (Furukawa et al. 2010) indicated a complicated interaction between lung cancer and smoking based on an analysis applying the ERR model, that is, they applied generalized joint effect models, which they called “generalized additive and multiplicative ERR interaction models” and found stronger evidence than an earlier analysis (Pierce et al. 2003) against the additive approach. This indication partially supports the proposed method which provides the conclusion that, for lung cancer, projections should be based on the ERR model (ERR weight = 0.97).

Other methods applied to assign qualitative weights

Projection of risks for purposes of radiation protection makes the assumption that the LSS risk estimates can be applied to other exposed populations. In order to assess the evidence for this assumption, it is useful to consider data from different populations in the few pooled analyses that exist, such as Preston et al. (2002) and Ron et al. (1995) for breast and thyroid cancer, respectively. The individual study EAR and ERR risk values given in such pooled analyses are often compared to assess overall agreement. It can be seen from Table 6 in Ron et al. (1995) that the individual study EAR central risk estimates agree with each other much better than the corresponding ERR estimates. In four of the cohort studies, for persons exposed under 15 years of age, the ERR per unit dose (range, with 95 % CI: 2.5 (0.6; 20.0) to 32.5 (14.0; 57.1)/Gy) differed by a factor of 13, compared to a factor of 3 for the EAR per unit dose (range, with 95 % CI: 2.6 (1.7; 3.6) to 7.6 (2.7; 13.0)/10⁴ person–years Gy), although in both cases the confidence intervals of the upper and lower range values overlap. Similarly, Preston et al. (2002) noted a better agreement between the individual study EAR central estimates than between ERR central estimates. The latter observation for breast cancer was considered in the assignment of expert judgment weights for the EAR of 100 % in ICRP 103 (2007). An additional consideration here is that, although consistency in risk estimates may be observed in some projections of the multi-dimensional LSS EAR (and/or ERR) models with models (or point estimates) from other studies, the degree of observed consistency may change between different model projections. ICRP 103 (2007) also pointed out that the use of EAR models for predicting cancer risks in sites generally associated with regular screening is problematic because variation in screening intensity will have a marked effect on the rate of identified radiation-associated cancers.
The results obtained from the proposed quantitative method of determining weights with the LSS data are mostly consistent with this concern of ICRP 103 (2007), but not with the results regarding inter-study agreements in central risk estimates for thyroid and breast cancer.
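The fold differences and the overlap of confidence intervals quoted above for the thyroid cancer ranges of Ron et al. (1995) can be checked with a few lines of arithmetic. The sketch below uses only the central estimates and 95 % CI bounds given in the text; the helper names are hypothetical and serve purely as illustration.

```python
def fold_range(lowest, highest):
    """Ratio of the highest to the lowest central estimate."""
    return highest / lowest


def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) confidence intervals share any values."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]


# ERR per Gy: lowest and highest central estimates with their 95 % CIs.
err_fold = fold_range(2.5, 32.5)
err_overlap = intervals_overlap((0.6, 20.0), (14.0, 57.1))

# EAR per 10^4 person-years Gy: lowest and highest central estimates with 95 % CIs.
ear_fold = fold_range(2.6, 7.6)
ear_overlap = intervals_overlap((1.7, 3.6), (2.7, 13.0))

print(f"ERR: factor {err_fold:.1f}, CIs overlap: {err_overlap}")
print(f"EAR: factor {ear_fold:.1f}, CIs overlap: {ear_overlap}")
```

The ERR central estimates span a factor of 13 and the EAR central estimates a factor of about 2.9 (rounded to 3 in the text), and in both cases the confidence intervals of the extreme values overlap.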

The main aim of the work presented here is to describe a quantitative method that could aid expert opinion in future transfers of radiation risks from one population to another. Further work is required to expand on the calculations presented here and to perform an exhaustive analysis of weights calculated from results currently available in the literature. It is recommended here that, in future analyses of cohort data relating to radiation risks, the goodness of fit parameters for parsimonious ERR and EAR models, obtained with a thorough modeling of baseline rates, also be provided as an aid to determining whether the additive or the multiplicative model fits the data better. Such parsimonious models could be considered even if authors publish, and actually prefer, a categorical model based on many subgroups of explanatory co-variables as their main results. Finally, it is noted that the transfer of radiation risks from one population to another is not limited to a mixture of just one ERR and one EAR model; in future, a full MMI procedure could be applied instead, so that model uncertainty is accounted for more completely.

Conclusion

In the transport of radiation risks from one population (i.e., the LSS of atomic bomb survivors) to another population at risk, transfer via ERR or EAR or a mixture of both has been performed in the past in a qualitative and inconsistent way. This paper identifies an approach that could aid expert judgment and could be applied, with the cautions given, to help achieve consistency of approach and evidence-based results and therefore contribute to future health risk assessments.

However, it is important to state that definitive conclusions regarding the appropriate method for transporting cancer risks are limited by a lack of knowledge in several areas. Such areas include, but are not limited to, unknown factors and uncertainties in biological mechanisms and in genetic and environmental risk factors for carcinogenesis, uncertainties in radiation dosimetry, and insufficient statistical power and/or incomplete follow-up in data from radio-epidemiological studies. It is also important to acknowledge that the generalization and interpretation of radiation effect estimates based on the LSS cancer data, when projected to other populations, are particularly uncertain for cancer sites where considerable differences exist between site-specific baseline rates in the LSS and the other populations of interest.