Multistage Carcinogenesis: A Unified Framework for Cancer Data Analysis

  • Suresh Moolgavkar
  • Georg Luebeck


Traditional approaches to the analysis of epidemiologic data focus on estimating the relative risk and are based on the proportional hazards model. Proportionality of hazards is a strong assumption that is often violated but seldom checked. Risk often depends on detailed patterns of exposure to environmental agents, but detailed exposure histories are difficult to incorporate into these traditional analyses. For epidemiologic data on cancer, an alternative approach can be based on ideas of multistage carcinogenesis. The process of carcinogenesis is characterized by mutation accumulation and clonal expansion of partially altered cells on the pathway to cancer. Although this paradigm is now firmly established, most epidemiologic studies of cancer incorporate ideas of multistage carcinogenesis neither in their design nor in their analyses. In this paper we briefly discuss stochastic multistage models of carcinogenesis and the construction of the appropriate likelihoods for analyzing epidemiologic data with these models. Statistical analyses based on multistage models can explicitly incorporate detailed exposure histories in the construction of the likelihood. We give examples to show that ideas of multistage carcinogenesis can help reconcile seemingly contradictory findings and yield insights into epidemiologic studies of cancer that would be difficult or impossible to obtain from conventional methods. Finally, multistage cancer models provide a unified framework for analyses of data from diverse sources.
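
The likelihood construction described in the abstract can be illustrated with a minimal sketch. It assumes the two-stage clonal expansion model with constant parameters, for which the survival and hazard functions have a well-known closed form. The abstract itself gives no formulas, so the parameterization (initiation rate `nu`, division rate `alpha`, death rate `beta`, malignant conversion rate `mu`), the function names, and the numerical values below are assumptions chosen for illustration only. Time-varying exposure histories, which the chapter emphasizes, are not handled here.

```python
import math


def tsce_roots(alpha, beta, mu):
    """Roots p < 0 < q of the quadratic that characterizes the two-stage model."""
    net = alpha - beta - mu
    disc = math.sqrt(net * net + 4.0 * alpha * mu)
    return 0.5 * (-net - disc), 0.5 * (-net + disc)


def tsce_log_survival(t, nu, alpha, beta, mu):
    """Log of the probability of remaining cancer-free up to age t."""
    p, q = tsce_roots(alpha, beta, mu)
    denom = q * math.exp(-p * t) - p * math.exp(-q * t)
    return (nu / alpha) * math.log((q - p) / denom)


def tsce_hazard(t, nu, alpha, beta, mu):
    """Hazard (age-specific incidence) at age t, i.e. -d/dt of the log survival."""
    p, q = tsce_roots(alpha, beta, mu)
    num = p * q * (math.exp(-q * t) - math.exp(-p * t))
    denom = q * math.exp(-p * t) - p * math.exp(-q * t)
    return (nu / alpha) * num / denom


def log_likelihood(records, nu, alpha, beta, mu):
    """Log-likelihood for cohort records (entry_age, exit_age, is_case).

    Each subject contributes survival from entry to exit (left truncation
    at entry); a case additionally contributes the hazard at exit.
    """
    total = 0.0
    for entry_age, exit_age, is_case in records:
        total += tsce_log_survival(exit_age, nu, alpha, beta, mu)
        total -= tsce_log_survival(entry_age, nu, alpha, beta, mu)
        if is_case:
            total += math.log(tsce_hazard(exit_age, nu, alpha, beta, mu))
    return total


if __name__ == "__main__":
    # Hypothetical parameter values (per year) and hypothetical records.
    params = dict(nu=0.1, alpha=10.0, beta=9.85, mu=1e-7)
    cohort = [(40.0, 72.0, True), (35.0, 80.0, False), (50.0, 66.0, False)]
    print(log_likelihood(cohort, **params))
```

In practice the log-likelihood would be maximized numerically over the parameters, and not all of them are separately identifiable from incidence data alone, so a reparameterization is usually needed before fitting.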


Keywords: Multistage carcinogenesis; Stochastic models; Clonal expansion model; Proportional hazards model; Colon cancer; Screening; Folate supplementation


Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  • Suresh Moolgavkar (1, 2)
  • Georg Luebeck (3)
  1. Fred Hutchinson Cancer Research Center, Seattle, USA
  2. Exponent, Inc., Bellevue, USA
  3. Fred Hutchinson Cancer Research Center, Seattle, USA
