A short review of model selection techniques for radiation epidemiology

  • Original Paper
  • Radiation and Environmental Biophysics

Abstract

A common type of statistical challenge, widespread across many areas of research, involves the selection of a preferred model to describe the main features and trends in a particular data set. The objective of model selection is to balance the quality of fit to data against the complexity and predictive ability of the model achieving that fit. Several model selection techniques, including two information criteria, which aim to determine which set of model parameters the data best support, are reviewed here. The techniques rely on computing the probabilities of the different models, given the data, rather than considering the allowed values of the fitted parameters. Such information criteria have only recently been applied in radiation epidemiology, although they have longer traditions of application in other areas of research. The purpose of this review is to make two information criteria more accessible by fully detailing how to calculate them in a practical way and how to interpret the resulting values. This aim is supported by examples involving the computation of risk models for radiation-induced solid cancer mortality fitted to the epidemiological data from the Japanese A-bomb survivors. These examples illustrate that the Bayesian information criterion is particularly useful in concluding that the weight of evidence is in favour of excess relative risk models that depend on age-at-exposure and excess relative risk models that depend on age-attained.
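As a rough illustration of the calculation the abstract refers to, the sketch below assumes that the two information criteria are the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) in their standard textbook forms, AIC = -2 ln L + 2k and BIC = -2 ln L + k ln n, where L is the maximised likelihood, k the number of fitted parameters and n the number of observations. The model names, log-likelihood values and n used here are hypothetical and are not taken from the paper; which count to use for n with grouped data is itself an assumption.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: -2 ln L + 2k (standard form)."""
    return -2.0 * log_likelihood + 2.0 * n_params


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Bayesian information criterion: -2 ln L + k ln n (standard form)."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


# Hypothetical fits: (maximised log-likelihood, number of fitted parameters).
# These numbers are illustrative only and do not come from the paper.
models = {
    "model_A": (-1520.4, 4),
    "model_B": (-1518.9, 6),
}
n_obs = 1000  # assumed number of observations

scores = {
    name: {"AIC": aic(ll, k), "BIC": bic(ll, k, n_obs)}
    for name, (ll, k) in models.items()
}

# The model with the smallest criterion value is preferred.
best_by_aic = min(scores, key=lambda name: scores[name]["AIC"])
best_by_bic = min(scores, key=lambda name: scores[name]["BIC"])

for name, s in scores.items():
    print(f"{name}: AIC = {s['AIC']:.1f}, BIC = {s['BIC']:.1f}")
print("preferred by AIC:", best_by_aic)
print("preferred by BIC:", best_by_bic)
```

In this made-up case model_B fits slightly better (higher log-likelihood), but its two extra parameters cost more than the fit improves, so both criteria prefer model_A. Because ln n exceeds 2 once n is larger than about 7, the BIC penalises extra parameters more heavily than the AIC for all but very small data sets.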

Notes

  1. However, this should not give the impression that the standard model selection approach involving maximum likelihoods pays no attention to the number of fit parameters, which, in fact, determines the number of degrees of freedom, as explained below.

Acknowledgments

The author would like to thank Dr. W. Rühm and Dr. J. R. Walsh for critically reading the manuscript, Prof. D. Pierce and Dr. P. Jacob for useful discussions, and two anonymous reviewers for many valuable comments which led to an improvement of the original manuscript. This work makes use of the data obtained from the Radiation Effects Research Foundation (RERF) in Hiroshima, Japan. RERF is a private foundation funded equally by the Japanese Ministry of Health and Welfare and the US Department of Energy through the US National Academy of Sciences. The conclusions in this work are those of the author and do not necessarily reflect the scientific judgement of RERF or its funding agencies.

Author information

Corresponding author

Correspondence to Linda Walsh.

Appendix

Table 6 Fit parameters [with standard errors (SE)] for the four preferred models in Table 5 as defined by Eqs. 6–10

About this article

Cite this article

Walsh, L. A short review of model selection techniques for radiation epidemiology. Radiat Environ Biophys 46, 205–213 (2007). https://doi.org/10.1007/s00411-007-0109-0

  • DOI: https://doi.org/10.1007/s00411-007-0109-0
