Radiation and Environmental Biophysics, Volume 50, Issue 1, pp 21–35

Multi-model inference of adult and childhood leukaemia excess relative risks based on the Japanese A-bomb survivors mortality data (1950–2000)

Authors

  • L. Walsh
    • Department “Radiation Protection and Health”, Federal Office for Radiation Protection
  • Jan Christian Kaiser
    • Institute of Radiation Protection, Helmholtz Zentrum München
Original Paper

DOI: 10.1007/s00411-010-0337-6

Cite this article as:
Walsh, L. & Kaiser, J.C. Radiat Environ Biophys (2011) 50: 21. doi:10.1007/s00411-010-0337-6

Abstract

Some relatively new issues that augment the usual practice of ignoring model uncertainty, when making inference about parameters of a specific model, are brought to the attention of the radiation protection community here. Nine recently published leukaemia risk models, developed with the Japanese A-bomb epidemiological mortality data, have been included in a model-averaging procedure so that the main conclusions do not depend on just one type of model or statistical test. The models have been centred here at various adult and young ages at exposure, for some short times since exposure, in order to obtain specially computed childhood Excess Relative Risks (ERR) with uncertainties that account for correlations in the fitted parameters associated with the ERR dose–response. The model-averaged ERR at 1 Sv was not found to be statistically significant for attained ages of 7 and 12 years but was statistically significant for attained ages of 17, 22 and 55 years. Consequently, such risks when applied to other situations, such as children in the vicinity of nuclear installations or in estimates of the proportion of childhood leukaemia incidence attributable to background radiation (i.e. low doses for young ages and short times since exposure), are only of very limited value, with uncertainty ranges that include zero risk. For example, assuming a total radiation dose to a 5-year-old child of 10 mSv and applying the model-averaged risk at 10 mSv for a 7-year-old exposed at 2 years of age would result in an ERR = 0.33, 95% CI: −0.51 to 1.22. One model (UNSCEAR 2006) contributed most strongly to the model-averaged risks of leukaemia, carrying about half of the total unit weighting, and is recommended for application in future leukaemia risk assessments that continue to ignore model uncertainty. However, on the basis of the analysis presented here, it is generally recommended to take model uncertainty into account in future risk analyses.

Introduction

The assessment of detrimental health risks due to exposures from ionizing radiation has been an endeavour that has increased in magnitude and effort over the last century. Solid cancer and leukaemia incidence and mortality have emerged, from the many indicators of cellular damage and health effects investigated to date, as having radiation as an important and proven risk factor. Studies on survivors of the World War II atomic bombings over Hiroshima and Nagasaki continue to provide valuable radiation epidemiological data and quantitative assessments of the radiation-related solid cancer and leukaemia risks (Preston et al. 2003, 2004, 2007).

A related, recurring, research topic is whether increased risks for childhood leukaemia incidence exist in geographical regions near nuclear power stations or other installations related to the nuclear industry (Laurier et al. 2008). Nearly all of the 198 local nuclear site studies and 25 multi-site descriptive studies of leukaemia risk among children and young adults in the vicinity of nuclear facilities, recently reviewed (Laurier et al. 2008), have concluded that no significant leukaemia excesses exist. However, some local clusters of leukaemia cases are apparent and levels of concern rose recently, after a German study reported indications of a decreasing leukaemia incidence risk with distance from nuclear power plants among children under 5 years of age (Kaatsch et al. 2008). Another related issue involves estimates of the proportion of childhood leukaemia incidence that may be due to natural background ionizing radiation (e.g. about 15–20% in the United Kingdom according to Little et al. (2009), Wakeford et al. (2009)). Such estimates rely on the risk models developed with the Japanese A-bomb epidemiological data.

In view of these concerns, it is of interest to take a new detailed look at the models for leukaemia risks from ionizing radiation that have been fitted recently to data from the Japanese A-bomb Life Span Study (LSS) mortality studies. Although the modern studies mentioned earlier involve leukaemia incidence, whereas the LSS dataset considered here is for leukaemia mortality, it is assumed here that incidence and mortality are interchangeable for inter-study comparison purposes, because the treatment options were very limited in the past. A general review of recently published leukaemia models reveals here that—although existing models can be applied to young ages at exposure and short times since exposure of above 5–10 years—all published leukaemia models yield risks and uncertainties that are centred at adult age at exposure and middle-aged attained age, for several decades after exposure. The necessary calculations required to derive risks from these models at young ages at exposure and short times since exposure involve considerable difficulties and uncertainties associated with combining many fitted parameters, unpublished parameter correlations and very considerable model-to-model variations.

The purpose of the present paper is to re-examine and re-fit, under the same set of conditions, nine currently published leukaemia models (Preston et al. 2004; Little et al. 2008; UNSCEAR 2006; Schneider and Walsh 2009; Richardson et al. 2009; BEIR 7 phase 2 2006) that have been fitted to the most recent epidemiological mortality data (Preston et al. 2004) from the Japanese A-bomb survivors. This involved derivations of the overall Excess Relative Risks (ERR) of leukaemia mortality for the models, both as they currently stand in the literature and also with modifications to produce a degree of compatibility between models. Most of the models considered here have also been centred at various young ages at exposure and shorter times since exposure, in order to obtain specially computed childhood risks and uncertainties that account for correlations in the fitted parameters associated with the dose–response. Techniques of multi-model inference have also been applied to these nine models, and the risks of leukaemia in adults aged 55 years and in young persons aged 7, 12, 17 and 22 years, which include model uncertainty, have been obtained.

Materials and methods

Model choice and multi-model inference statistics

Datasets in radiation epidemiology are observational and often involve numerous covariables that can be fitted to various forms of models. Consequently, assessing the preferred model by selection involving Akaike’s information criterion is an important part of the inferential process (Akaike 1973, 1974; Walsh 2007). Akaike’s information criterion is defined by AIC = –2log(MaxLikelihood) + 2 k, where k is the number of parameters in the model. Models with smaller values of AIC are favoured on the basis of fit and parsimony. However, using the data to choose a model and progressing to subsequent inference, assuming that the selected model has been chosen a priori, is a process that fails to acknowledge the uncertainties present in the model selection process (Chatfield 1995). Indeed, a large source of uncertainty for cancer risk evaluations associated with radiation exposure is model uncertainty. Neglecting this source is a serious shortcoming of previous evaluations. The modelling of leukaemia risks has been extended in this paper to take account of model uncertainties by allowing all currently circulating models (known to the authors) to contribute to risk estimation.
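The AIC definition above can be illustrated with a short Python sketch. It assumes that the deviance reported by the fitting software stands in for −2log(MaxLikelihood) when models are compared on the same data, which is consistent with the deviance and AIC columns of Table 3; the function name is hypothetical.

```python
def aic_from_deviance(deviance: float, n_params: int) -> float:
    """AIC = deviance + 2k, treating the deviance as -2 log(MaxLikelihood)."""
    return deviance + 2 * n_params


# (deviance, number of parameters) as reported in Table 3 of this paper
unscear_aic = aic_from_deviance(2260.0, 10)  # UNSCEAR (2006)
little_aic = aic_from_deviance(2259.7, 11)   # Little et al. (2008)
print(unscear_aic, little_aic)
```

The extra parameter in the Little et al. (2008) model lowers the deviance slightly, but not by enough to offset the penalty of 2 per parameter, so the simpler model has the smaller AIC.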

Recent statistical literature covers various approaches for model averaging (Hoeting et al. 1999; Posada and Buckley 2004; Claeskens and Hjort 2008). The work by Burnham and Anderson (1998, 2002) has been influential and has resulted in a major paradigm shift away from hypothesis testing as a tool for model choice. To average over all candidate models, a weight is assigned to each model, and then measures of interest are inferred across all weighted models. The introduction of AIC for model selection has been a positive contribution to the field of radiation epidemiology. In other fields, such as biology and ecology, the AIC weights for model averaging have already been providing an objective basis for model selection and multi-model inference (e.g. Zhang and Townsend 2009). Within the AIC framework, the weights (wi, i = 1, 2,…, m) are computed for each model,
$$ w_{i} = \frac{\exp[-0.5(\text{AIC}_{i} - \min \text{AIC})]}{\sum\limits_{j = 1}^{m} \exp[-0.5(\text{AIC}_{j} - \min \text{AIC})]} $$
(1)
where m is the number of models and min AIC is the smallest AIC value among all models considered.
The minimum AIC is subtracted merely to avoid numerical problems caused by very high or very low arguments inside the exponential function. Individual wi values can be interpreted to indicate the probability that the ith model is the best among the m models considered. A model-averaged estimator of a quantity, \( \hat{\mu} \) (which is, in the practical terms of the analysis presented here, either the ERR/1 Sv or the low-dose limiting ERR/Sv or 100 ERR/10 mSv) can then be obtained as a weighted average of estimators, \( \hat{\mu}_{i} \), from model i
$$ \hat{\mu } = \sum\limits_{i = 1}^{m} {w_{i} \hat{\mu }_{i} .} $$
(2)
It is also possible to replace AIC in Eq. (1) either by the Bayesian Information Criterion (BIC) (see Walsh (2007) for an explanation of BIC) or other information criteria for special purposes (Claeskens and Hjort 2008). Approaches have also been developed to derive parameters of the distribution of \( \hat{\mu} \) (Hoeting et al. (1999), Claeskens and Hjort (2008)).
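Equations (1) and (2) can be sketched in a few lines of Python. The function names are hypothetical; the inputs are the AIC values reported in Table 3 and the ERR at 1 Sv point estimates for a = 55, e = 30, t = 25 reported in Table 4, listed in the row order of those tables.

```python
import math


def akaike_weights(aic_values):
    """Eq. (1): weights from AIC differences relative to the smallest AIC."""
    min_aic = min(aic_values)
    raw = [math.exp(-0.5 * (aic - min_aic)) for aic in aic_values]
    total = sum(raw)
    return [r / total for r in raw]


def model_average(weights, estimates):
    """Eq. (2): weighted average of the per-model estimates."""
    return sum(w * mu for w, mu in zip(weights, estimates))


# AIC values (Table 3) and ERR at 1 Sv for a = 55, e = 30, t = 25 (Table 4)
aic = [2302.7, 2293.2, 2292.1, 2284.0, 2281.9, 2281.7, 2280.0, 2811.5, 3732.3]
err_1sv = [2.72, 2.11, 2.50, 2.73, 2.95, 2.78, 2.71, 2.55, 2.72]

weights = akaike_weights(aic)
averaged_err = model_average(weights, err_1sv)
print([round(w, 3) for w in weights])
print(round(averaged_err, 2))
```

With these rounded inputs the UNSCEAR (2006) model (seventh entry) receives a weight close to 0.512 and the weighted average is close to 2.77, in line with Tables 3 and 4.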
The confidence intervals for the model averages \( \hat{\mu} \) were calculated here by two methods. Method 1 involved evaluating Eq. 1 of Hoeting et al. (1999) (under the assumption that all models are equally likely a priori) with Monte-Carlo simulated sub-sets of realizations, i.e. one sub-set per model, where the size of each sub-set corresponded to its Akaike weight. Then, the sub-sets pertaining to separate models (e.g. 500, 300 and 200 realizations from models 1, 2 and 3 with Akaike weights of 0.5, 0.3 and 0.2, respectively) were merged to form a total set of 1,000 realizations. The realizations were then sorted, and the percentiles corresponding to the level of confidence required were located and adopted as upper and lower confidence limits. Method 2 involved application of the following formula for the standard error (SE) of the model-averaged \( \hat{\mu} \), from Burnham and Anderson (2004):
$$ \text{SE}(\hat{\mu}) = \sum\limits_{i = 1}^{m} w_{i} \sqrt{\text{Var}(\hat{\mu}_{i}) + (\hat{\mu}_{i} - \hat{\mu})^{2}}. $$
(3)
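The two interval methods can be sketched as follows. This is a minimal illustration and not the Crystal-Ball procedure used in the paper: the per-model estimators are replaced by hypothetical normal distributions whose means and standard deviations are invented for the example, and the function names are assumptions. The weights 0.5, 0.3 and 0.2 are those of the example in the text.

```python
import math
import random


def se_model_average(weights, estimates, variances):
    """Method 2, Eq. (3): standard error of the model-averaged estimate."""
    mu_bar = sum(w * mu for w, mu in zip(weights, estimates))
    return sum(
        w * math.sqrt(var + (mu - mu_bar) ** 2)
        for w, mu, var in zip(weights, estimates, variances)
    )


def mixture_interval(weights, samplers, n_total=1000, level=0.95, seed=0):
    """Method 1: merge per-model realizations in proportion to the weights."""
    rng = random.Random(seed)
    merged = []
    for w, sampler in zip(weights, samplers):
        n_i = round(w * n_total)  # sub-set size follows the Akaike weight
        merged.extend(sampler(rng) for _ in range(n_i))
    merged.sort()
    tail = (1.0 - level) / 2.0
    lower = merged[int(tail * len(merged))]
    upper = merged[min(len(merged) - 1, int((1.0 - tail) * len(merged)))]
    return lower, upper, len(merged)


# Hypothetical inputs: three models with normally distributed estimators
weights = [0.5, 0.3, 0.2]
means = [2.7, 2.8, 2.9]   # invented central estimates
sds = [0.6, 0.6, 0.7]     # invented standard deviations
samplers = [lambda rng, m=m, s=s: rng.gauss(m, s) for m, s in zip(means, sds)]

lower, upper, n_used = mixture_interval(weights, samplers)
se = se_model_average(weights, means, [s ** 2 for s in sds])
print(n_used, round(lower, 2), round(upper, 2), round(se, 2))
```

The sub-set sizes come out as 500, 300 and 200 realizations, as in the text's example, and the interval limits are read directly from the sorted, merged set.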

Data on leukaemia mortality

The cohort of the atomic bomb survivors from Hiroshima and Nagasaki is unique owing to: the large number of cohort members; the long follow-up period of more than 50 years; a composition that includes males and females, children and adults; whole-body exposures (that are more typical for radiation protection situations than the partial-body exposures associated with many medically exposed cohorts); a large dose range from natural to lethal levels; and an internal control group with negligible doses, i.e. those who survived at large distances (>3 km) from the hypocentres. The most recent dataset on cancer mortality for the follow-up time periods from 1950 to 2000 (Preston et al. 2004) (data file: DS02CAN.DAT from http://www.rerf.or.jp), with the new dosimetry system DS02 (Young and Kerr 2005), has been selected for most recently published leukaemia models. The mortality data are in a tabulated grouped form and are categorized by gender, city, age-at-exposure (5-year intervals), age-attained (5-year intervals), the calendar time period during which the health checks were made, and weighted survivor marrow dose.

Weighted doses

Weighted organ doses are defined by
$$ d = d_{\gamma } + RBE\,d_{n} , $$
(4)
where dγ and dn are organ absorbed doses from γ-rays and neutrons, respectively. For RBE, the relative biological effectiveness of neutrons, the value 10 has been used. This value was chosen because it is used in most of the current analyses of the LSS data, although there are arguments that other values might be more appropriate (Rühm and Walsh 2007). Since this analysis involves currently published models for all types of leukaemia grouped together, weighted marrow doses have been applied with dose categories truncated to correspond to the 4-Gy kerma level. In common with previous analyses (e.g. Preston et al. 2004), statistical methods are used to reduce risk-estimation bias resulting from imprecision in individual dose estimates. The necessary adjustments for random errors in dosimetry applied to the dose term are already applied in the publicly available data, but a separate adjustment involving a multiplication factor to the dose-squared covariable should be done explicitly, according to either Pierce et al. (1990) (factor 1.12) or Pierce et al. (2008) (revised factor 1.15). Since most of the published analyses apply the factor 1.12, this has been adopted here.
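As a small illustration of Eq. (4) and of the dose-squared adjustment, the following Python sketch uses the RBE of 10 and the factor 1.12 adopted above. The function names and the example absorbed doses are hypothetical, and applying the factor as a simple multiplier on the squared weighted dose is this sketch's reading of the adjustment described in the text.

```python
RBE_NEUTRON = 10.0         # value adopted in the text
DOSE_SQ_ADJUSTMENT = 1.12  # Pierce et al. (1990) factor adopted in the text


def weighted_dose(d_gamma: float, d_neutron: float, rbe: float = RBE_NEUTRON) -> float:
    """Eq. (4): weighted organ dose d = d_gamma + RBE * d_n."""
    return d_gamma + rbe * d_neutron


def dose_squared_covariable(d: float, factor: float = DOSE_SQ_ADJUSTMENT) -> float:
    """Dose-squared covariable with the explicit adjustment factor applied."""
    return factor * d ** 2


# Hypothetical absorbed doses in Gy, chosen only for illustration
d = weighted_dose(d_gamma=0.95, d_neutron=0.005)
d2 = dose_squared_covariable(d)
print(d, d2)
```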

The leukaemia models for A-bomb survivors

The risk models for radiation-induced leukaemia mortality applied here were selected by two criteria: they were already published, and they were based on the most recent leukaemia mortality dataset described earlier (Preston et al. 2004; Little et al. 2008; UNSCEAR 2006; Schneider and Walsh 2009; Richardson et al. 2009; BEIR 7 phase 2 2006). To the authors’ knowledge, the selection is complete by these two criteria. The first paper to report a leukaemia model, based on the then pre-release mortality data (1950–2000), was that of Preston et al. (2004). This was, however, an Excess Absolute Risk (EAR) model that has been recomputed here (refitted in AMFIT) in the original parameter formulation, but as an ERR model (with fitted parameters given in the Appendix Table 6) for inclusion in the model-averaging process. The paper by Little et al. (2008) reports on the models that were adopted in the UNSCEAR (2006) report. All models were identical between these latter two publications except the leukaemia ERR model, which had a term adjusting for age at exposure in Little et al. (2008); this adjustment was not included in the UNSCEAR (2006) model, and therefore both models are considered here. The models of Schneider and Walsh (2009), developed for assessing health risks to astronauts, are very similar to those in Little et al. (2008) and UNSCEAR (2006) even though they were developed independently at the same time. This is not surprising, however, since they are very similar to previously published models for solid cancer (Preston et al. 2003). The paper by Richardson et al. (2009) contains a model (in Table 3 of that paper) based on a cubic spline function of time since exposure but fitted to a dataset (not available to other members of the scientific community) that contains 14 additional leukaemia deaths and more detailed information on the types of leukaemia than the generally available data. For this reason, the model used by Richardson et al. (2009) for all types of leukaemia has been fitted here to the publicly available data for inclusion in the model-averaging process. Details of the latter model fitted to the publicly available data are presented here in the "Appendix", where Table 7 gives the fitted parameters and Fig. 2 shows a version of Fig. 1 from Richardson et al. (2009) repeated for the parameters in Table 7. Table A1 in the "Appendix" of Richardson et al. (2009) also gives parameters for the BEIR 7 phase 2 (2006) model with two additional, similar, models that have also been fitted to the publicly available data; all three of these models were included in the model averaging here.
Fig. 1

Model-averaged ERR at 1 Sv values with Monte-Carlo simulated 95% CIs for various values of attained age in years from Table 4 (top panel). The lower panel shows the same ERR values again without confidence intervals for comparison with the low-dose limiting ERR/Sv and 100 and 10 times the model-averaged ERR at 10 and 100 mSv, respectively, from Table 5. Note that the values at attained age 12, 17 and 22 years are for 10 years since exposure, and the value at attained age 7 is for 5 years since exposure.

The present analysis, which is based on the models mentioned above, is gender averaged to facilitate the model-to-model comparisons. All models considered made use of a general rate (hazard) model of the form
$$ \lambda \left( {d,a,e,t,c,s,l} \right) = \lambda_{0} \left( {a,e,t,c,s,l} \right)\left[ {1 + {\text{ERR}}\left( {d,a,e,t,c,s,l} \right)} \right], $$
(5)
for the ERR, where λ0(a, e, t, c, s, l) is the baseline cancer death rate, a is attained age, e is age at exposure, t is time since exposure, c is a two-level city indicator variable for Hiroshima and Nagasaki, s is a gender indicator variable and l is a two-level indicator variable for survivor location from the bomb hypocentre, i.e. proximal (<3 km) or distal.

The functional forms for the baseline model parts (i.e. to account for the spontaneous cancer rates that would have occurred in the absence of ionizing radiation) are given in Table 8 of the "Appendix" where it can be seen that seven models adopted a fully parametric approach and two dealt with baseline rates via stratification.

All models considered have an ERR form factorized into a function of dose and a modifying function that depends on some choice of a, e, t, c, s and l. The functional forms vary and are given in Table 1.
Table 1

Forms of excess relative risk models applied in the model-averaging procedure

| Model reference | Form of ERR model |
|---|---|
| EAR model of Preston et al. (2004) (used here as an ERR model) | \( (1 + \omega_{s} s)\,(d + \theta d^{2})\,\exp\{\theta_{1\text{–}3}\,\text{ecat}_{1\text{–}3} + \tau_{1\text{–}3}\,\text{ecat}_{1\text{–}3}\,\ln(t/25)\} \) |
| BEIR VII, phase 2 (2006) | \( (\beta_{s} s)\,(d + \theta d^{2})\,\exp(\gamma e' + \delta \ln(t/25) + \phi e' \ln(t/25)) \), where \( e' = \min(0, (e - 30)/10) \) |
| Richardson et al. (2009) model 2, their Appendix Table A1 | As BEIR VII, phase 2 (2006) ERR model above (only differs in the baseline) |
| Schneider and Walsh (2009) | \( (1 + \omega_{s} s)\,(\beta d + \alpha d^{2})\,\exp(-\gamma (e - 30) + \varepsilon \ln(a/55)) \) |
| | \( (1 + \omega_{s} s)\,(\beta d + \alpha d^{2})\,\nu \exp(d)\,\exp(-\gamma (e - 30) + \varepsilon \ln(a/55)) \) |
| Little et al. (2008) | \( (\beta d + \alpha d^{2})\,\exp(\gamma \ln(e/30) + \varepsilon \ln(a/55)) \) |
| UNSCEAR (2006) | \( (\beta d + \alpha d^{2})\,\exp(\varepsilon \ln(a/55)) \) |
| Richardson et al. (2009) main model for all types of leukaemia, their Table 3 | \( (1 + \omega_{c} c)\,(\beta d + \alpha d^{2})\,\exp(\gamma e' + \phi_{1} e' t + \phi_{2} e' t^{2} + \phi_{3} e' t^{3} + \phi_{5} e' (t - 30)_{+}^{3}) \), where \( e' = \min(0, (e - 30)/10) \) and \( (t - 30)_{+}^{3} = (t - 30)^{3} \) if \( t - 30 > 0 \), 0 otherwise |
| Richardson et al. (2009) model 3, their Appendix Table A1 | As BEIR VII, phase 2 (2006) ERR model above (only differs in the baseline) |

βs and the other Greek symbols are fitted parameters, d is the weighted marrow dose, a is attained age, e is age at exposure, t is time since exposure, c is a two-level city indicator variable for Hiroshima and Nagasaki and s is a gender indicator variable. ecat1–3 is a three-level indicator variable for three age at exposure groups (0–19, 20–39 and 40+ years)
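To make the factorized form concrete, the following Python sketch evaluates the simplest entry of Table 1, the UNSCEAR (2006) form. The parameter values are illustrative only and are not the fitted values from the Appendix: β is the low-dose limiting ERR/Sv quoted in Table 5, while α and ε are back-calculated here so that the curve passes through the UNSCEAR (2006) ERR at 1 Sv values in Table 4 for attained ages 55 and 7. The function name is hypothetical.

```python
import math


def err_unscear_form(d, a, beta, alpha, eps, a_ref=55.0):
    """(beta*d + alpha*d^2) * exp(eps * ln(a / a_ref)): UNSCEAR (2006) form in Table 1."""
    return (beta * d + alpha * d ** 2) * math.exp(eps * math.log(a / a_ref))


# Illustrative values only, NOT the fitted parameters:
# beta from Table 5; alpha and eps back-calculated from Table 4 (a = 55 and a = 7)
BETA, ALPHA, EPS = 1.50, 1.21, -1.61

for age in (55, 22, 17, 12, 7):
    print(age, round(err_unscear_form(1.0, age, BETA, ALPHA, EPS), 1))
```

With these values the ERR at 1 Sv rises steeply as attained age falls, and the intermediate ages (12, 17 and 22 years) agree with the UNSCEAR (2006) row of Table 4 to within rounding.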

It is instructive here to explain, using an example, a property of the age and time risk centring constants that simplifies the computation of uncertainties. Consider splitting ERR(d, a, e) into ERR(d) × exp(−γ (e−30) + ε ln(a/55)), where the ERR at unit dose is for an age at exposure of 30 years and an attained age of 55 years, and γ, ε are fitted parameters. The model centring at an age at exposure of 30 years and an attained age of 55 years (or a time since exposure of 25 years) serves as a reference for the main ERR dose–response, the fitted parameters and their uncertainties. In all but one of the nine models considered (i.e. the Richardson et al. (2009) model in Table 3 of that paper), different choices of centring values will not affect the overall quality of fit (the deviance value). Such a model can then be refitted at different centring ages, for example e = 7, a = 17, and the ERR at 1 Sv can be found by combining only the fitted parameters relevant to the dose–response. This is more efficient than having to combine all ERR fitted parameters for dose, age and time, and saves considerable effort in the evaluation of the relevant uncertainties. Currently it is not possible to compute such uncertainties reliably from published information, even though all ERR fitted parameter estimates are published, because none of the papers on the nine models considered contains information on the parameter correlations. The re-evaluation of the models done here has applied this property of the centring constants, i.e. the models were re-centred (if necessary) at the ages of interest: for adults at a = 55, e = 30, t = 25, and for children at a series of values e = 2, 7, 12 years (approximately in the middle of the data categories) for t = 5 and 10 years. The computation of the ERR and its uncertainty then involved the combination of only the dose–response parameters, where the associated uncertainties were computed by Monte-Carlo simulations, which took the relevant elements of the full parameter correlation matrix (i.e. without fixing the baseline parameters) into account, using the Crystal-Ball software with 1,000 realizations per simulation.
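The centring property can be made explicit for a model with only an attained-age modifier, such as the UNSCEAR (2006) form in Table 1. The identity below is a sketch; the primed symbols and the centring age \( a_{0} \) are notation introduced here for illustration and do not appear in the original models.

$$ (\beta d + \alpha d^{2})\exp\left[\varepsilon \ln\left(\frac{a}{55}\right)\right] = (\beta' d + \alpha' d^{2})\exp\left[\varepsilon \ln\left(\frac{a}{a_{0}}\right)\right], \qquad \beta' = \beta\left(\frac{a_{0}}{55}\right)^{\varepsilon},\quad \alpha' = \alpha\left(\frac{a_{0}}{55}\right)^{\varepsilon}. $$

Both sides describe the same risk surface, so the deviance is unchanged. At \( a = a_{0} \) and d = 1 Sv, however, the ERR reduces to \( \beta' + \alpha' \), so that its uncertainty involves only the variances and the covariance of the two refitted dose–response parameters.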

Estimation of fitted parameters and statistical analysis

The maximum likelihood technique (Harrell 2001) was used to re-fit all nine models here at the several combinations of age and time centring described in the last section. Best estimates, uncertainty ranges (which included both Wald-type and likelihood-based confidence intervals) and correlations of the fitted parameters were determined as in the original publications by minimizing the deviance using the AMFIT module of EPICURE software (Preston et al. 1993).

Results

Epidemiological details

Since this work involves mainly a re-analysis of previously published models, most of the important epidemiological details, such as total number of subjects (86,611), total number of leukaemia deaths (296) or total person-years at risk (3.18 million), have already been given (Preston et al. 2004). However, since the models are re-centred at young ages at exposure and short times since exposure, it is important to report here, in Table 2, on the amount of data supporting this region of the dataset. Since the epidemiological follow-up did not begin until 5 years after the A-bombs, the youngest available attained age category is 5–<10 years of age. It can be seen from Table 2 that this first available age category has four leukaemia deaths (cases) and 22,796 person-years. The person-year-weighted mean weighted marrow dose and its range are 0.12 Sv (0–4.4 Sv) for the whole category and 1.85 Sv (1.31–4.13 Sv) for the cases in this category, i.e. all four cases had very high doses. The minimum attained age is 7.37 years, and in the category of age at exposure <5 years, there were 13 deaths from leukaemia before the attained age of 20 years.
Table 2

Details of the amount of data supporting the models at young ages at exposure and short times since exposure

| Attained age category (years) | 5–<10 | 10–<15 | 15–<20 | 20–<25 |
|---|---|---|---|---|
| Number of leukaemia deaths | 4 | 6 | 7 | 9 |
| Number of person-years | 22,796 | 70,167 | 106,226 | 154,253 |
| Category-specific marrow dose and range (Sv) | 0.12; 0–4.4 | 0.12; 0–4.3 | 0.12; 0–4.5 | 0.13; 0–4.5 |
| Case-specific marrow dose and range (Sv) | 1.85; 1.31–4.13 | 0.08; 0–3.37 | 0.51; 0–4.28 | 0.29; 0.01–2.73 |

The last two rows give the mean marrow doses that have been weighted both by RBE as in Eq. 4 and also for person-years

The quality of the model fit and associated values of parameters, information criterion and weights for model averaging

Table 3 gives the number of model parameters, the goodness of model fit to the data in terms of the deviance values, and the Akaike weights assigned to each model. It is noteworthy that the UNSCEAR (2006) model attains the highest of all the Akaike weights, 0.512, and therefore has the largest contribution to the model-averaged ERR at 1 Sv, which is given in the last row of Table 4 and as a function of attained age in Fig. 1. The 95% confidence intervals for the model-averaged ERR at 1 Sv in Table 4 were computed with Method 1 involving Monte-Carlo simulations, but they were also calculated by Method 2 (Eq. 3) and found to be very similar, indicating that 1,000 realizations were adequate. Table 4 also gives the ERR at 1 Sv, for each attained age, age at exposure and time since exposure considered, computed from the parameter values given in Appendix Table 9. It can be seen that the model-averaged values are close to the values from the UNSCEAR (2006) model, due to its high weighting and the similarity of the other models with medium weights to this model. The model-averaged ERR at 1 Sv values for attained ages of 7 and 12 years, given in columns 3–5 of Table 4, are not statistically significant, with the highest central estimate occurring for an attained age of 7 years (i.e. at the edge of the corresponding covariable range in the dataset). However, it is interesting to note that statistically significant model-averaged ERR at 1 Sv values are found in Table 4 for attained ages of 17 and 22 years and that the two risks for an attained age of 17 years are almost the same. This indicates little difference in the risk at this age between exposure at age 7, 10 years later (model-averaged ERR at 1 Sv with 95% CI is 18.4 (0.6; 35.6)) and exposure at age 12, 5 years later (model-averaged ERR at 1 Sv with 95% CI is 19.2 (0.1; 37.9)).
Table 3

The number of parameters, deviance, AIC and model weighting, wi, applied in the model-averaging procedure

| Model reference | Number of parameters | Deviance | AIC | wi |
|---|---|---|---|---|
| EAR model of Preston et al. (2004) (used here as an ERR model) | 22 | 2,258.7 | 2,302.7 | 0 |
| BEIR VII, phase 2 (2006) | 19 | 2,255.2 | 2,293.2 | 0.001 |
| Richardson et al. (2009) model 2, their Appendix Table A1 | 21 | 2,250.1 | 2,292.1 | 0.001 |
| Schneider and Walsh (2009) LQ | 13 | 2,258.0 | 2,284.0 | 0.069 |
| Schneider and Walsh (2009) LQ-exp | 14 | 2,253.9 | 2,281.9 | 0.198 |
| Little et al. (2008) | 11 | 2,259.7 | 2,281.7 | 0.219 |
| UNSCEAR (2006) | 10 | 2,260.0 | 2,280.0 | 0.512 |
| Richardson et al. (2009) main model for all types of leukaemia, their Table 3 | 448 | 1,915.5 | 2,811.5 | 0 |
| Richardson et al. (2009) model 3, their Appendix Table A1 | 1,086 | 1,560.3 | 3,732.3 | 0 |

Individual wi values can be interpreted to indicate the probability that the ith model is the best among the nine models considered. The models with wi indicated as zero actually had very small weights, of less than 0.0005, that have been rounded to zero. All of the models presented in this table were re-fitted to the same dataset, i.e. to DS02CAN.DAT from http://www.rerf.or.jp

Table 4

ERR at 1 Sv with Monte-Carlo simulated 95% CIs (given in parentheses after each estimate) for various values of attained age (a) and age at exposure (e) in years

| Model reference | a = 55, e = 30, t = 25 | a = 7, e = 2, t = 5 | a = 12, e = 2, t = 10 | a = 12, e = 7, t = 5 | a = 17, e = 7, t = 10 | a = 17, e = 12, t = 5 | a = 22, e = 12, t = 10 |
|---|---|---|---|---|---|---|---|
| EAR model of Preston et al. (2004) (used here as an ERR model) | 2.72 (1.21; 4.24) | 27.9 (8.0; 70.4) | 11.3 (4.3; 21.9) | 27.9 (8.0; 70.4) | 11.3 (4.3; 21.9) | 27.9 (8.0; 70.4) | 11.3 (4.3; 21.9) |
| BEIR VII, phase 2 (2006) | 2.11 (−0.09; 2.98) | 92.9 (−133.5; 249.5) | 29.5 (−24.7; 61.3) | 54.2 (−69.7; 126.6) | 19.9 (−12.2; 37.9) | 31.6 (−27.9; 64.4) | 13.4 (−4.4; 22.0) |
| Richardson et al. (2009) model 2, their Appendix Table A1 | 2.50 (−0.10; 3.59) | 103.4 (−143.4; 302.0) | 33.4 (−23.2; 71.2) | 60.9 (−64.8; 147.9) | 22.7 (−12.1; 43.8) | 35.9 (−22.6; 74.2) | 15.4 (−5.0; 26.8) |
| Schneider and Walsh (2009) LQ | 2.73 (1.60; 4.03) | 104.0 (−72.7; 288.8) | 35.4 (−7.3; 79.7) | 38.6 (−11.0; 88.6) | 19.2 (−0.6; 38.8) | 21.0 (0.3; 41.6) | 12.5 (2.3; 22.8) |
| Schneider and Walsh (2009) LQ-exp | 2.95 (1.51; 4.06) | 98.8 (−40.2; 263.0) | 34.7 (−2.4; 75.9) | 37.9 (−4.5; 77.8) | 19.3 (2.0; 36.5) | 21.0 (3.1; 39.5) | 12.8 (3.0; 21.4) |
| Little et al. (2008) | 2.78 (1.53; 4.04) | 75.9 (−39.6; 194.2) | 29.0 (−7.5; 62.6) | 34.4 (−8.3; 77.0) | 18.5 (0.2; 35.6) | 19.9 (−1.2; 38.1) | 12.6 (2.2; 22.3) |
| UNSCEAR (2006) | 2.71 (1.49; 3.88) | 74.9 (−44.0; 190.9) | 31.4 (−6.5; 70.0) | 31.4 (−6.5; 70.0) | 17.9 (0.9; 34.6) | 17.9 (0.9; 34.6) | 11.8 (2.6; 20.9) |
| Richardson et al. (2009) main model for all types of leukaemia, their Table 3 | 2.55 (1.12; 4.18) | 203.5 (7.8; 4,654.6) | 185.8 (16.7; 1,947.5) | 93.1 (6.2; 1,149.1) | 86.4 (12.1; 582.3) | 42.6 (5.2; 295.1) | 40.2 (8.1; 182.4) |
| Richardson et al. (2009) model 3, their Appendix Table A1 | 2.72 (−0.46; 3.95) | 212.1 (−5,527; 712.1) | 47.9 (−56.3; 126.1) | 111.6 (−207.3; 331.8) | 31.0 (−32.6; 66.6) | 58.6 (−76.9; 141.2) | 20.1 (−16.9; 39.3) |
| Model-averaged ERR at 1 Sv | 2.77 (1.43; 3.84)* | 81.9 (−40.1; 216.2)# | 31.8 (−5.4; 68.9) | 33.9 (−6.6; 74.1) | 18.4 (0.6; 35.6) | 19.2 (0.1; 37.9) | 12.23 (2.7; 21.3) |

t is time since exposure in years. In the model of Preston et al. (2004), e = 30 refers to the age at exposure group 20–39 years, and the other values of e refer to the age at exposure group 0–19 years

*, # independent calculation with the MECAN software (see "Discussion" section) yields 1.49; 3.96* and −64.5; 238#

Table 5

Model-averaged values of ERR (\(10^{-2}\)) at 10 mSv and ERR (\(10^{-1}\)) at 100 mSv with Monte-Carlo simulated 95% CIs (given in parentheses after each estimate) for various values of attained age (a) and age at exposure (e) in years

| | a = 55, e = 30, t = 25 | a = 7, e = 2, t = 5 | a = 12, e = 2, t = 10 | a = 12, e = 7, t = 5 | a = 17, e = 7, t = 10 | a = 17, e = 12, t = 5 | a = 22, e = 12, t = 10 |
|---|---|---|---|---|---|---|---|
| Low-dose limiting ERR/Sv (β) and 95% likelihood CI | 1.50 (0.19; 3.2) | 41.5 (3.9; 258.4) | 17.4 (1.8; 73.8) | 17.4 (1.8; 73.8) | 9.9 (1.1; 33.4) | 9.9 (1.1; 33.4) | 6.6 (0.8; 18.8) |
| Model-averaged ERR (\(10^{-2}\)) at 10 mSv | 1.16 (−1.56; 2.78) | 32.6 (−51.3; 122) | 13.2 (−18.7; 41.4) | 13.9 (−18.2; 46.3) | 7.7 (−11.2; 22.6) | 7.9 (−11.0; 24.0) | 5.1 (−6.8; 14.7) |
| Model-averaged ERR (\(10^{-1}\)) at 100 mSv | 1.33 (−1.02; 2.92) | 38.0 (−46.0; 122.3) | 15.2 (−13.3; 43.8) | 16.1 (−14.5; 47.2) | 8.9 (−8.1; 23.3) | 9.1 (−7.7; 24.3) | 5.9 (−3.5; 14.7) |

t is time since exposure in years. Also given for comparison purposes are the best estimates of the low-dose limiting ERR/Sv (fitted parameter β in Table 1) for the model with the greatest Akaike weighting (the UNSCEAR (2006) model)

Model-averaged ERR at 100 and 10 mSv are given in Table 5. It is also theoretically possible to go through the model-averaging procedure for the low-dose limiting ERR/Sv (i.e. in Table 1, the fitted parameters β, βs, θ1–3). However, it was decided to present the low-dose limiting ERR/Sv for the UNSCEAR (2006) model as the best estimate in Table 5 for two reasons: (1) the UNSCEAR (2006) model is associated with a high weighting and (2) the lower likelihood-based 95% confidence bounds for β and βs could not be found numerically here in four of the models considered (i.e. the BEIR VII phase 2 model, the two modifications of this model in the Appendix Table A1 of Richardson et al. (2009), and the Schneider and Walsh (2009) linear quadratic exponential dose–response model). It can be seen from Table 5 and Fig. 1 (lower panel) that there is good agreement between the low-dose limiting ERR/Sv and the central estimates for the model-averaged ERR at 100 and 10 mSv when scaled to the same dose, and that these scaled estimates are all approximately a factor of two lower than the ERR at 1 Sv.
Table 6

Main ERR model fitted here

| Age at exposure group | Effect | Fit parameters with 95% Wald CI |
| --- | --- | --- |
| 0–19 | *ERR/Sv exp(θ1) | 1.710 (0.529; 5.529) |
|  | Time since exposure power, τ1 | −1.3 (−1.9; −0.7) |
| 20–39 | *ERR/Sv exp(θ2) | 1.375 (0.445; 4.249) |
|  | Time since exposure power, τ2 | −0.7 (−1.4; 0.1) |
| 40– | *ERR/Sv exp(θ3) | 1.200 (0.362; 3.975) |
|  | Time since exposure power, τ3 | −0.2 (−1.2; 0.7) |
| All ages | Sex effect ωs | 0.04 (−0.29; 0.38) |
| All ages | Dose–response curvature θ | 0.98 (−0.52; 2.48) |

This model was originally fitted and given in the Table 8 of Preston et al. (2004) as an EAR model

* Note that these values are gender-averaged low-dose slopes at 25 years after exposure in a linear quadratic model. The parameter notation refers to Table 1 here and is not exactly the same notation as originally used in Preston et al. (2004)

Discussion

Techniques of multi-model inference, involving model averaging with Akaike type weights, have been applied to nine recently published models fitted to the most recent, generally available, LSS leukaemia mortality data (1950–2000). Leukaemia risks including model uncertainty, in adults aged 55 years and in young persons aged 7, 12, 17 and 22 years, have been obtained. The procedure applied here appeals to Occam’s razor, because it leads to an exclusion of complex models if they do not describe the data significantly better than their simpler counterparts (i.e. in the context of Table 3, models that are “excluded” have very small weights, which have been rounded to 0).
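The way in which Akaike weights penalise unnecessary parameters can be sketched as follows. This is a minimal illustration, assuming the usual definition AIC = deviance + 2k (k fitted parameters) and weights proportional to exp(−ΔAIC/2); the deviances and parameter counts below are hypothetical and are not the values behind Table 3.

```python
import math


def aic(deviance: float, n_parameters: int) -> float:
    """AIC = deviance + 2 * (number of fitted parameters)."""
    return deviance + 2 * n_parameters


def akaike_weights(aic_values):
    """Return weights proportional to exp(-0.5 * (AIC_i - AIC_min)), summing to 1."""
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (value - best)) for value in aic_values]
    total = sum(relative_likelihoods)
    return [r / total for r in relative_likelihoods]


# Hypothetical (deviance, parameter count) pairs, NOT taken from the paper.
candidates = {
    "simple model": (2260.0, 10),
    "similar simple model": (2260.4, 10),
    "complex model": (2259.0, 16),
}
aics = [aic(deviance, k) for deviance, k in candidates.values()]
weights = dict(zip(candidates, akaike_weights(aics)))

for name, weight in weights.items():
    print(f"{name:>22}: weight = {weight:.3f}")
```

In this made-up case the complex model fits slightly better (lower deviance), but its six extra parameters cost 12 AIC units, so its weight falls below 0.01 and would be rounded to 0.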

It has been shown that the model-averaged risks of leukaemia are influenced most strongly (51.2% of total weighting) by one model (UNSCEAR 2006) and that three other models, which are similar to this model (Little et al. 2008 and both models of Schneider and Walsh (2009)), have contributed almost all of the remaining weighting (48.6% of the total). As alluded to above, this is because the remaining five models have included several fitted parameters that were not indicated by the data to be necessary, and this has been numerically penalised in the model-averaging procedure. The model of Richardson et al. (2009), reproduced here in the penultimate lines of Tables 1 and 8 of the "Appendix", has two fitted parameters in the ERR model (see Appendix, Table 7) that are not statistically significant when fitted to the generally available dataset (deviance = 1,889.0; γ and ϕ1, p = 0.42 and 0.23, respectively). The fitted values and 90% confidence intervals quoted in the original publication, i.e. γ = −1.06 (−2.81; 0.74) and ϕ1 = −0.20 (−0.50; 0.07) for the special dataset with extra information on leukaemia sub-types, are also not statistically significant. According to calculations done here, with the generally available dataset, it is possible to leave out either γ or ϕ1 from the model (the deviance then changes by only 0.7 and 1.6, respectively, relative to the full model, i.e. by less than the critical value of 3.8 from the likelihood ratio test at p = 0.05 for keeping a parameter in a nested model) and so to arrive at two models where all fitted parameters are statistically significant. These two reduced models have some very different characteristics in the central estimate of ERR at 1 Sv for 10 years of age at exposure and for times since exposure of under 20 years, both when compared to each other and to the published sub-optimal model (see the lower panel of Fig. 2 in the "Appendix").
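The nested-model check described above can be sketched as a likelihood ratio test with one degree of freedom. This is an illustrative sketch only: it assumes that the deviance change between nested models is chi-squared distributed with 1 degree of freedom, and the function names are hypothetical.

```python
import math


def chi2_sf_1df(x: float) -> float:
    """Upper tail probability of the chi-squared distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def keep_parameter(deviance_change: float, alpha: float = 0.05) -> bool:
    """Likelihood ratio test: keep the extra parameter only if p < alpha."""
    return chi2_sf_1df(deviance_change) < alpha


# Deviance changes quoted in the text for leaving out gamma or phi_1.
quoted_changes = {"gamma": 0.7, "phi_1": 1.6}
decisions = {name: keep_parameter(change) for name, change in quoted_changes.items()}

for name, change in quoted_changes.items():
    print(f"{name}: deviance change = {change}, p = {chi2_sf_1df(change):.2f}, "
          f"keep = {decisions[name]}")
```

Both deviance changes are well below the critical value of about 3.8, so under this test neither parameter would be retained on its own merits.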
Table 7

Main ERR model given in Table 3 of Richardson et al. (2009) for all types of leukaemia originally fitted to a special dataset with extra information on leukaemia sub-types, but re-fitted here to the publicly available data

| Parameter | β | α | ωc | γ | ϕ1 | ϕ2 | ϕ3* | ϕ5* |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Fitted value | 1.620 | 0.926 | −0.461 | −0.917 | −0.214 | 0.0185 | −3.235·10⁻⁴ | 7.428·10⁻⁴ |
| 95% Wald CI, lower | 0.1845 | 0.1742 | −0.872 | −3.138 | −0.562 | 0.000 | −6.144·10⁻⁴ | 9.651·10⁻⁵ |
| 95% Wald CI, upper | 3.056 | 1.679 | −0.050 | 1.305 | 0.134 | 0.037 | −3.249·10⁻⁵ | 0.00139 |

The mathematical form of this model can also be seen in the penultimate lines of Tables 1 and 8 here

* Note that the precision originally given in Richardson et al. (2009) for these parameters was not high enough to either allow an independent reproduction of the original graphics in the Fig. 1 of that paper or to re-compute ERR central estimates at other required ages and times since exposure. The precision given here for these parameters is good enough to overcome these difficulties
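As a consistency check on Table 7, approximate Wald p values can be recovered from the fitted values and their 95% confidence intervals. The sketch below assumes symmetric Wald intervals on the parameter scale (estimate ± 1.96 standard errors) and a two-sided normal test; the helper name is hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wald_p_value(estimate: float, lower: float, upper: float) -> float:
    """Two-sided Wald p value, with the standard error recovered from a 95% CI."""
    standard_error = (upper - lower) / (2.0 * Z_95)
    z = estimate / standard_error
    return math.erfc(abs(z) / math.sqrt(2.0))


# (fitted value, lower bound, upper bound) for gamma and phi_1 from Table 7.
table7 = {
    "gamma": (-0.917, -3.138, 1.305),
    "phi_1": (-0.214, -0.562, 0.134),
}
p_values = {name: wald_p_value(*row) for name, row in table7.items()}

for name, p in p_values.items():
    print(f"{name}: p = {p:.2f}")
```

Under these assumptions the recovered p values are close to the 0.42 and 0.23 quoted in the Discussion for γ and ϕ1.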

Fig. 2

The upper panel shows a repeat of Fig. 1 in Richardson et al. (2009) (originally for the special dataset, with extra information on leukaemia sub-types) but fitted here to the publicly available leukaemia mortality data (1950–2000), with fitted model parameters from Table 7. The lower panel shows the same three lines as in the upper panel on a different scale and, where the main differences exist, the characteristics of the time since exposure effect obtained with optimal model 1 (i.e. with the ϕ1 term omitted from the full model in the penultimate line of Table 1) and optimal model 2 (i.e. with the γ term omitted from this full model)

Similarly, the model of Preston et al. (2004), which was originally published as an EAR model and has been re-fitted here as an ERR model with the same parametric form, had several parameters that were not statistically significant (e.g. baseline parameters β9, β10, β11, β12, as in Appendix Table 8) both in the original EAR model and in the analogous ERR model described here. Since Preston et al. (2004) preferred to publish an EAR rather than an ERR model, because the EAR fitted the data somewhat better, it could be argued that the weights and the ERR estimates, which went into the multi-model inferences here, should have been calculated from the original model. However, the degree of model improvement is only small: where the EAR had a deviance of 2,254.8 (value quoted from http://www.rerf.or.jp, filename:DS02can.log), the analogous ERR model presented here had a deviance of 2,258.7, i.e. ΔAIC = 3.9. In such a pair-wise comparison, this value is smaller than the reference value of ΔAIC = 5.9, at which the model with the smaller AIC has a 95% chance of being correct (see Walsh 2007, and references therein, for an explanation).
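The pair-wise comparison can be made concrete with the two-model form of the Akaike weight. This is a sketch under two assumptions: that both models have the same number of fitted parameters (so that ΔAIC equals the deviance difference, as the quoted numbers imply), and that the "chance of being correct" is the two-model Akaike weight 1/(1 + exp(−ΔAIC/2)); function names are illustrative.

```python
import math


def probability_better_model(delta_aic: float) -> float:
    """Akaike weight of the lower-AIC model in a two-model comparison."""
    return 1.0 / (1.0 + math.exp(-0.5 * delta_aic))


def delta_aic_for_probability(p: float) -> float:
    """Inverse relation: the delta AIC at which the better model has weight p."""
    return 2.0 * math.log(p / (1.0 - p))


# Deviances quoted in the text: ERR form (2,258.7) versus EAR form (2,254.8).
delta = round(2258.7 - 2254.8, 1)
reference = delta_aic_for_probability(0.95)
weight_ear = probability_better_model(delta)

print(f"delta AIC = {delta}")
print(f"reference delta AIC for 95% = {reference:.1f}")
print(f"weight of the EAR model = {weight_ear:.2f}")
```

With these definitions the 95% reference value is 2 ln(19), which rounds to 5.9, and a ΔAIC of 3.9 corresponds to a weight somewhat below 0.9 for the better-fitting EAR form.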
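Table 8

Forms of baseline models applied to account for the spontaneous leukaemia mortality in the ERR models considered in the model-averaging procedure

| Model reference | Form of baseline model $\ln\{\lambda_0(a,e,s,c)\}$ = |
| --- | --- |
| EAR model of Preston et al. (2004) (used here as an ERR model) | $\beta_{1\text{--}4}\,(s \cdot c) + \beta_{5\text{--}6}\,s \cdot \ln(a/70) + \beta_{7\text{--}8}\,s \cdot \ln^2(a/70) + \beta_{9\text{--}10}\,s \cdot \max^2(0, \ln(a/70)) + \beta_{11\text{--}12}\,s \cdot (e-30) + \beta_{13\text{--}14}\,s \cdot (e-30)^2$ |
| BEIR VII, phase 2 (2006) | $\beta_0 + \beta_1 s + \beta_2 c + \beta_{3\text{--}4}\,s \cdot \ln(a/70) + \beta_{5\text{--}6}\,s \cdot \ln^2(a/70) + \beta_{7\text{--}8}\,s \cdot \max^2(0, \ln(a/70)) + \beta_{9\text{--}10}\,s \cdot (e-30) + \beta_{11\text{--}12}\,s \cdot (e-30)^2$ |
| Richardson et al. (2009) model 2, their Appendix Table A1 | $\beta_0 + \beta_1 s + \beta_{2\text{--}5}\,(c \cdot l) + \beta_{6\text{--}7}\,s \cdot \ln(a/70) + \beta_{8\text{--}9}\,s \cdot \ln^2(a/70) + \beta_{10\text{--}11}\,s \cdot \max^2(0, \ln(a/70)) + \beta_{12\text{--}13}\,s \cdot (e-30) + \beta_{14\text{--}15}\,s \cdot (e-30)^2$ |
| Schneider and Walsh (2009) (both models) | $\beta_{1\text{--}4}\,(s \cdot c) + \beta_5 \ln(a/70) + \beta_6 \ln^2(a/70) + \beta_7 (e-30) + \beta_8 (e-30)^2$ |
| Little et al. (2008) and UNSCEAR (2006) | $\beta_0 + \beta_1 s + \beta_2 c + \beta_3 \ln(a/70) + \beta_4 \ln^2(a/70) + \beta_5 (e-30) + \beta_6 (e-30)^2$ |
| Richardson et al. (2009) main model for all types of leukaemia, their Table 3 | Stratification on categories of s, c, a, b and l |
| Richardson et al. (2009) model 3, their Appendix Table A1 | Stratification on categories of s, c, a, e and l |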

β1–14 are fit parameters, a is attained age, e is age at exposure (applied here as a proxy variable for birth cohort, because exposure was momentary and at the same point in calendar time), c is a two-level city indicator variable for Hiroshima and Nagasaki, s is a gender indicator variable, l is a two-level indicator variable for survivor location relative to the bomb hypocentre (i.e. proximal or distal), and b indexes birth cohort (<1895, 1895–1904, 1905–1914, 1915–1924, 1925–1945)
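To show how a parametric baseline of this kind is evaluated, the following sketch codes the form listed for Little et al. (2008) and UNSCEAR (2006). The coefficient values are hypothetical placeholders (the fitted values are not given here), and the 0/1 coding of the indicator variables s and c is an assumption made for illustration.

```python
import math


def log_baseline_rate(a: float, e: float, s: int, c: int, beta) -> float:
    """ln(lambda_0) = b0 + b1*s + b2*c + b3*ln(a/70) + b4*ln^2(a/70)
    + b5*(e - 30) + b6*(e - 30)^2, with a and e in years."""
    log_age = math.log(a / 70.0)
    return (
        beta[0]
        + beta[1] * s
        + beta[2] * c
        + beta[3] * log_age
        + beta[4] * log_age ** 2
        + beta[5] * (e - 30.0)
        + beta[6] * (e - 30.0) ** 2
    )


# Hypothetical coefficients beta_0..beta_6, chosen only to exercise the function.
beta = [-9.0, -0.3, 0.1, 2.0, -0.5, 0.01, 0.0002]

# At the centring values a = 70 and e = 30 (with s = c = 0) only beta_0 remains.
reference_rate = math.exp(log_baseline_rate(70.0, 30.0, 0, 0, beta))
child_rate = math.exp(log_baseline_rate(7.0, 2.0, 0, 0, beta))
print(reference_rate, child_rate)
```

The centring at a = 70 and e = 30 means that β0 is directly interpretable as the log baseline rate for the reference category at those ages.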

None of the models that included gender-specific parameters, either in the baseline or in the ERR model parts, indicated any statistically significant gender effects. This indicates that it is not necessary to eliminate the female data, and so accept the resulting wider confidence intervals, when computing leukaemia risks from the A-bomb data for the purpose of comparison with cohorts of male nuclear workers (e.g. as in Cardis et al. (2005)). For males aged 20–60 in the A-bomb cohort, based on 83 leukaemia deaths excluding CLL, Cardis et al. (2005) quoted an ERR/Sv of 1.54 (95% CI −1.14 to 5.33) from the linear term of a linear quadratic dose–response model (i.e. the low-dose limiting ERR/Sv). Cardis et al. (2005) also quoted a linear ERR/Sv of 3.15 (95% CI 1.58 to 5.67) for the same sub-group, which is nevertheless statistically consistent with the model-averaged ERR at 1 Sv of 2.77 (95% CI 1.43; 3.84) found here for adults aged 55.

A recent meta-analysis, which included results from 10 nuclear worker studies and adjusted for publication bias (Daniels and Schubauer-Berigan 2010, see also the editorial to this paper, Walsh 2010), found an ERR at 100 mGy of 0.19 (95% CI: 0.07; 0.32). The authors also used the linear term of a linear quadratic model as above (ERR at 100 mGy = 0.15 (95% CI −0.11 to 0.53)) for comparison with the A-bomb risks and concluded that the A-bomb estimates are not precise. However, the best comparison for this meta-analysis provided here by Table 5 (ERR at 100 mGy = 0.13 (95% CI −0.10 to 0.29)) is somewhat less uncertain even though model uncertainty is included. When applying leukaemia models derived from the LSS to nuclear worker studies, it should be considered that the exposure situations are very different, i.e. an almost instantaneous exposure compared to extended cumulative exposures; however, the good agreement in central risk estimates here could be taken as an indication of risk equivalence between protracted and acute exposures at the same dose.

In order to evaluate which time or age-dependent covariables are most important for the risk modification, it is instructive to consider the one model (UNSCEAR 2006) that most strongly influenced the model-averaged risks of leukaemia. Little et al. (2009) assessed the model selection for this (UNSCEAR 2006) model in terms of all combinations of time or age-dependent covariables and concluded that attained age was more clearly indicated than age at exposure or time since exposure, or any two of these variables. Since only effect modification in terms of a power functional form was originally considered, both a linear and an exponential function for the attained age modification were computed here, but both were found to fit the data less well (power function (deviance = 2,260.0; reference model for ΔAIC computation); exponential function (deviance = 2,263.9; ΔAIC = 3.9); linear function (deviance = 2,266.6; ΔAIC = 6.6)). However, since the mean marrow dose of the four cases in the youngest age group available, corresponding to 5 to <10 years, was 1.85 Sv, with a very high lower value of the dose range of 1.31 Sv, all the risk models considered here could have suffered from an ill-defined baseline risk at young attained ages under 10 years. Consequently, the preferred functional form for age (i.e. power function) could be an artefact caused by the total lack of data for low exposures at ages under 10 years. In order to check this supposition, the same sequence of models was fitted to the data for attained age from 10 years. Again the linear and exponential functions for the attained age modification were found to fit the data less well (power function (deviance = 2,231.2); exponential function (deviance = 2,233.8; ΔAIC = 2.6); linear function (deviance = 2,235.2; ΔAIC = 4.0)).
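The ΔAIC values quoted above follow directly from the deviances, as the short sketch below reproduces. It assumes that the three functional forms have the same number of fitted parameters, so that ΔAIC equals the difference in deviance from the power-function reference; the dictionary layout and names are illustrative.

```python
# Deviances quoted in the text for three forms of the attained-age modifier.
deviances = {
    "all attained ages": {"power": 2260.0, "exponential": 2263.9, "linear": 2266.6},
    "attained age from 10 years": {"power": 2231.2, "exponential": 2233.8, "linear": 2235.2},
}


def delta_aic(fits, reference="power"):
    """Delta AIC relative to the reference form, assuming equal parameter counts."""
    return {form: round(deviance - fits[reference], 1) for form, deviance in fits.items()}


results = {dataset: delta_aic(fits) for dataset, fits in deviances.items()}

for dataset, deltas in results.items():
    # Rank the functional forms from best (smallest delta AIC) to worst.
    ranking = sorted(deltas, key=deltas.get)
    print(f"{dataset}: {deltas}, ranking = {ranking}")
```

The ranking of the three forms is the same in both datasets, although the differences shrink when attained ages under 10 years are excluded.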

The model-averaging procedure applied here may not have treated the models with stratified baseline in the best possible way. Stratified baseline models are important because they may account for confounding factors in an indirect way or provide an alternative approach to parametric background modelling for treating highly collinear covariables (e.g. see Walsh et al. 2009). An adaptation of the method considered here could be to apply model averaging to the group of models with parametric baselines, and then separately to the group of models with stratified baselines, finally combining the two model-averaged estimates. However, at this stage, the theory necessary to do this has not been published and requires further work and development.

A further point, which has not received attention in the past, is the influence of the uncertainties in the baseline model parameters on the ERR parameters. Radiation risks and uncertainties for various organs based on the A-bomb data have usually been identified with their linear dose–response fitted parameter (e.g. Preston et al. 2003). However, new software called MECAN, developed by one of the authors (JCK) and available on request, can account for uncertainties in parametric baseline parameters. Table 4 also includes some examples of independent calculations of 95% confidence intervals that include parametric baseline uncertainties with the MECAN software. However, for reasons of consistency, this method has not been applied globally in the model-averaging procedure here, because it is not yet sufficiently well developed to deal with stratified baselines.

Results presented here for the model-averaged ERR at 1 Sv, which imply an increase with decreasing attained age, are not statistically significant for attained age 7 and 12 years, but are statistically significant for attained age 17 and 22. Given such results for the ERR at 1 Sv for young ages, which is a much higher radiation dose than natural background radiation (about 0.5–2.5 mSv per year), it can be assumed that any extrapolations of the ERR for children from high doses to the low doses relevant to background radiation are associated with even higher uncertainties. These uncertainties should be built into calculations of the percentage of childhood leukaemia incidence attributable to background radiation (which involve estimates of ERR and EAR), such as those recently presented for the United Kingdom (Wakeford et al. 2009; Little et al. 2009). If this model uncertainty in ERR (or EAR) is explicitly built into such calculations, then the result could range from 0% to some much higher percentage, and thus it is difficult to see how the result of 15–20% (Wakeford et al. 2009; Little et al. 2009) could be statistically significant. Although these authors (Wakeford et al. 2009; Little et al. 2009) state that “the uncertainty associated with certain stages in the calculation is significant”, no confidence intervals, for the 15–20% quoted, were given.

There are difficulties similar to those just mentioned in extrapolating childhood leukaemia risks from the A-bomb data to obtain risks relevant to children under 5 years old living in the vicinity of nuclear power stations. Such children may have received annual doses of the order of a few μSv, in addition to the natural background radiation of 0.5–2.5 mSv. Assuming a total radiation dose to a 5-year-old of 10 mSv and applying the model-averaged risk at 10 mSv for a 7-year-old exposed at 2 years of age (found here from Table 5) would result in an ERR = 0.33, 95% CI: −0.51 to 1.22, which is a risk associated with large uncertainties. Generally, the greatest uncertainty in extrapolating from A-bomb data to other populations is associated with the children under the age of 10 years when exposed to the atomic bombs, 90% of whom are still alive (Preston et al. 2004). However, the follow-up relevant to the childhood leukaemia risks considered here was completed a long time ago.
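The worked example above amounts to reading one column of Table 5 and converting it from the tabulated units of 10⁻² to an ERR. A minimal sketch of that arithmetic, with illustrative variable names, is:

```python
# Table 5 entries for a = 7, e = 2, t = 5 are tabulated in units of 10^-2.
TABLE_UNIT = 1e-2
central, lower, upper = 32.6, -51.3, 122.0

err = round(central * TABLE_UNIT, 2)
confidence_interval = (round(lower * TABLE_UNIT, 2), round(upper * TABLE_UNIT, 2))

# The risk is not statistically significant if the interval includes zero.
includes_zero = confidence_interval[0] <= 0.0 <= confidence_interval[1]

# Relative risk corresponding to the central ERR estimate (RR = 1 + ERR).
relative_risk = 1.0 + err

print(f"ERR at 10 mSv = {err}, 95% CI = {confidence_interval}")
print(f"interval includes zero: {includes_zero}, RR = {relative_risk:.2f}")
```

The interval spans zero, which is why the central estimate on its own is of very limited value for this exposure scenario.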

Conclusion

Nine recently published leukaemia models have been included in a model-averaging procedure, so that the main conclusions drawn from model selection do not depend on just one type of statistical test, which could be associated with stringent assumptions (e.g. nested models). This procedure led to an exclusion of complex models less supported by the data than their simpler counterparts. One model (UNSCEAR 2006) contributed about half of the total weighting to the model-averaged risks of leukaemia. Results presented here for the model-averaged ERR at 1 Sv are not statistically significant for attained ages of 7 and 12 years, but are statistically significant for attained ages of 17, 22 and 55 years. The most important risk-modifying factor implied is attained age, with a power functional increase in risk with decreasing age. Since the model-averaged ERR at 1 Sv at attained ages of 7 and 12 years are not statistically significant, risks applied to low doses for young ages and short times since exposure that are based on the A-bomb data are only of limited value. Consequently, such risks when applied to other situations such as children in the vicinity of nuclear installations, or in estimates of the proportion of childhood leukaemia incidence attributable to background radiation, should include a full discussion of confidence intervals that will be very wide and include zero risk. The UNSCEAR (2006) model, as the most strongly weighted model, is recommended for application in future leukaemia risk assessments that do not include model uncertainty. However, on the basis of the analysis presented here, it is generally recommended to take model uncertainty into account in future risk analyses.

The authors have attempted to bring to the attention of the radiation protection community some relatively new issues that augment the usual practice of ignoring model uncertainty when making inference about parameters of a specific model. The application here of the Akaike weights is considered to be the simplest approach, mainly because it by-passes the need to specify Bayesian priors by assuming that all models are a priori equally likely. Although there is no doubt that model averaging using the Akaike weights can reduce bias due to model selection, the authors and one of the reviewers are not aware of any optimality theory concerning choice of model-averaging weights. A full Bayesian treatment may be more suitable for the present problem, given that the candidate models were developed with differing scientific goals in mind.

The authors would be very pleased if their initial efforts encourage more detailed analyses and more theoretical papers on model averaging in radiation protection, by much larger teams of experts. New methods for combining multi-model inferences with model predictive capability would also be very useful for radiation protection. The authors do not believe that this paper provides a benchmark for multi-model inference in radiation protection; it is merely a start in the right direction, and there is still a lot of work to be done.

Acknowledgments

The authors would like to thank Prof. D. Pierce and Dr. P. Jacob for useful discussions and Dr. J. R. Walsh for critically reading the manuscript. Special thanks are due to Prof. W. Rühm for his constructive advice. This work makes use of the data obtained from the Radiation Effects Research Foundation (RERF) in Hiroshima, Japan. RERF is a private foundation funded equally by the Japanese Ministry of Health and Welfare and the US Department of Energy through the US National Academy of Sciences. The conclusions in this work are those of the authors and do not necessarily reflect the scientific judgement of RERF or its funding agencies.

Copyright information

© Springer-Verlag 2010