Optimizing mHealth Interventions with a Bandit

  • Mashfiqui Rabbi
  • Predrag Klasnja
  • Tanzeem Choudhury
  • Ambuj Tewari
  • Susan Murphy
Part of the Studies in Neuroscience, Psychology and Behavioral Economics book series (SNPBE)


Mobile health (mHealth) interventions can improve health outcomes by intervening at the moment of need or in the right life circumstance. Such interventions are now technologically feasible because off-the-shelf mobile phones can acquire and process data in real time and deliver relevant interventions in the moment. Learning which intervention to provide in the moment, however, is an optimization problem. This chapter describes one algorithmic approach to optimizing mHealth interventions, the “bandit algorithm.” Bandit algorithms are well studied and commonly used in online recommendation (e.g., Google’s ad placement or news recommendations). Below, we walk through simulated and real-world examples to demonstrate how bandit algorithms can be used to personalize and contextualize mHealth interventions. We conclude by discussing challenges in developing bandit-based mHealth interventions.



This work has been supported by NIDA P50 DA039838 (PI: Linda Collins), NIAAA R01 AA023187 (PI: S. Murphy), NHLBI/NIA R01 HL125440 (PI: PK), and NIBIB U54EB020404 (PI: SK). A. Tewari acknowledges the support of a Sloan Research Fellowship and an NSF CAREER grant IIS-1452099.



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Mashfiqui Rabbi (1)
  • Predrag Klasnja (2)
  • Tanzeem Choudhury (3)
  • Ambuj Tewari (4)
  • Susan Murphy (5)

  1. Department of Statistics, Harvard University, Cambridge, USA
  2. School of Information, University of Michigan, Ann Arbor, USA
  3. Department of Information Science, Cornell University, Ithaca, USA
  4. Department of Statistics, University of Michigan, Ann Arbor, USA
  5. Department of Statistics and Department of Computer Science, Harvard University, Cambridge, USA
