Estimation of Optimal DTRs by Directly Modeling Regimes

  • Bibhas Chakraborty
  • Erica E. M. Moodie
Part of the Statistics for Biology and Health book series (SBH)


In this chapter, we consider several approaches to estimating the optimal dynamic treatment regime by directly modeling the regimes, as opposed to modeling the conditional mean outcome: inverse probability of treatment weighting, marginal structural models, and classification-based methods. The fundamental difference between these approaches and those considered in previous chapters (e.g., Q-learning and G-estimation) lies in the primary target of estimation and inference: the methods considered here target the parameters of the decision rule itself.
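As a rough illustration of what targeting the decision rule itself can look like, the sketch below uses inverse probability of treatment weighting to estimate the mean outcome under each member of a simple one-parameter class of single-stage rules, "treat if x > c", and then picks the threshold with the largest estimated value. It is not taken from the chapter: the function names, the simulated data, the known propensity of 0.5, and the threshold rule class are all assumptions made for illustration.

```python
import numpy as np

def ipw_value(y, a, x, propensity, rule):
    """Normalized IPW estimate of the mean outcome if everyone followed `rule`.

    y: outcomes; a: observed binary treatments (0/1); x: covariate;
    propensity: P(A = 1 | x) for each subject (assumed known or pre-estimated);
    rule: function mapping x to a recommended treatment (0/1).
    Assumes at least one subject's observed treatment agrees with the rule.
    """
    d = rule(x)                                            # treatment the rule recommends
    p_obs = np.where(a == 1, propensity, 1.0 - propensity) # P(A = observed a | x)
    w = (a == d) / p_obs                                   # weight only rule-consistent subjects
    return np.sum(w * y) / np.sum(w)

def best_threshold(y, a, x, propensity, grid):
    """Grid search over the hypothetical rule class 'treat if x > c'."""
    values = [
        ipw_value(y, a, x, propensity, lambda x, c=c: (x > c).astype(int))
        for c in grid
    ]
    k = int(np.argmax(values))
    return grid[k], values[k]

# Simulated randomized trial (illustrative assumption): treatment helps when x > 0.
rng = np.random.default_rng(0)
n = 5000
x = rng.uniform(-1.0, 1.0, n)
a = rng.integers(0, 2, n)               # randomized with probability 0.5
propensity = np.full(n, 0.5)
y = 1.0 + a * x + rng.normal(0.0, 0.5, n)

grid = np.linspace(-0.8, 0.8, 17)
c_hat, v_hat = best_threshold(y, a, x, propensity, grid)
print(f"estimated threshold: {c_hat:.2f}, estimated value: {v_hat:.3f}")
```

In this simulation the true optimal threshold is 0, so the selected threshold should land near it. Note that no model for the conditional mean outcome is fitted at any point; only the treatment assignment probabilities and the rule class enter the estimator.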


Keywords (machine-generated, not supplied by the authors): generalization error; baseline covariates; augmented data; inverse probability weighting; treatment rule



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Bibhas Chakraborty (1)
  • Erica E. M. Moodie (2)
  1. Department of Biostatistics, Columbia University, New York, USA
  2. Department of Epidemiology, Biostatistics, and Occupational Health, McGill University, Montreal, Canada
