A Patient-Centered Proposal for Bayesian Analysis of Self-Experiments for Health


The rise of affordable sensors and apps has enabled people to monitor various health indicators via self-tracking. This trend encourages self-experimentation, a subset of self-tracking in which a person systematically explores potential causal relationships to try to answer questions about their health. Although recent research has investigated how to support the data collection necessary for self-experiments, less research has considered the best way to analyze data resulting from these self-experiments. Most tools default to using traditional frequentist methods. However, the US Agency for Healthcare Research and Quality recommends using Bayesian analysis for n-of-1 studies, arguing from a statistical perspective. To develop a complementary patient-centered perspective on the potential benefits of Bayesian analysis, this paper describes types of questions people want to answer via self-experimentation, as informed by (1) our experiences engaging with irritable bowel syndrome patients and their healthcare providers and (2) a survey investigating what questions individuals want to answer about their health and wellness. We provide examples of how those questions might be answered using (1) frequentist null hypothesis significance testing, (2) frequentist estimation, and (3) Bayesian estimation and prediction. We then provide design recommendations for analyses and visualizations that could help people answer and interpret such questions. We find the majority of the questions people want to answer with self-experimentation data are better answered with Bayesian methods than with frequentist methods. Our results therefore provide patient-centered support for the use of Bayesian analysis for n-of-1 studies.


  1. So long as one makes use of informed priors (as we advocate here) and/or applies a hierarchical modeling approach.
  2. The widespread use of this null ritual in scientific fields is not without criticism [78]. Most pointedly, Gigerenzer went so far as to declare it a symptom of “mindless statistics” [69]. We will describe why we believe it is not applicable to small self-experiments but leave aside the question of its broader applicability to science.
  3. Readers familiar with standardized effect sizes (like Cohen’s d) might ask why we do not use them here. Like Cummings [79], we believe that unstandardized effect sizes (e.g., mean differences) are easier to interpret, particularly for individual decision-making: a person should know what one point on a pain scale they have used means to them, but they are less likely to know what a difference of 1 standard deviation means.
  4. We do not discuss the use of Bayes factors (one approach to Bayesian hypothesis testing) in this paper, as the sensitivity of Bayes factors to irrelevant details of the prior makes them difficult even for experienced analysts to use in practice [80]. Instead, if hypothesis testing is desired, we prefer estimation-based approaches, such as regions of practical equivalence, which we believe are also easier to interpret. Regions of practical equivalence answer questions like “how likely is the effect to be 0 (or close enough to 0 that I will not care)?” [46, 80].
  5. We used a variant of our Bayesian regression model with flat priors (i.e., priors in which all possible outcomes are equally likely, which is the implicit assumption a frequentist analysis makes) on the parameters to simulate the frequentist regression.
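As a rough illustration of the region-of-practical-equivalence question in footnote 4 and the flat-prior comparison in footnote 5, the following Python sketch may help. It is not the regression model used in the paper: it assumes a much simpler conjugate normal model with a known noise standard deviation, and the data, the prior settings, and the ±0.5 equivalence bounds are all hypothetical values chosen only for demonstration.

```python
from statistics import NormalDist, mean


def posterior_normal(prior_mean, prior_sd, data, noise_sd):
    """Conjugate normal-normal update for an unknown mean.

    Assumes the observation noise standard deviation is known,
    which is a simplification made for this sketch.
    """
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_var = 1.0 / (prior_precision + data_precision)
    post_mean = post_var * (prior_precision * prior_mean
                            + data_precision * mean(data))
    return NormalDist(post_mean, post_var ** 0.5)


def rope_probability(posterior, low, high):
    """Posterior probability that the effect lies inside [low, high]."""
    return posterior.cdf(high) - posterior.cdf(low)


# Hypothetical daily differences in a pain score
# (day with a suspected trigger minus day without it).
differences = [1.0, 0.5, 2.0, 1.5, 0.0, 1.0]
noise_sd = 1.0  # assumed known for this sketch

# Informed prior: a modest effect is expected before seeing the data.
informed = posterior_normal(0.5, 1.0, differences, noise_sd)

# Very wide prior: approximates a flat prior, so the posterior mean
# lands very close to the plain sample mean.
nearly_flat = posterior_normal(0.0, 1000.0, differences, noise_sd)

# "How likely is the effect to be close enough to 0 that I will not care?"
p_rope = rope_probability(informed, -0.5, 0.5)

print(f"posterior mean (informed prior): {informed.mean:.3f}")
print(f"posterior mean (nearly flat prior): {nearly_flat.mean:.3f}")
print(f"P(effect within +/-0.5 of zero): {p_rope:.3f}")
```

The equivalence bounds carry the patient-centered judgment here: they should reflect how small a change the person would consider not worth acting on, in the units of the scale they actually used.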


  1. Global Status Report on Noncommunicable Diseases. Geneva: World Health Organization; 2014
  2. Mamykina L, Mynatt ED, Kaufman DR (2006) Investigating health management practices of individuals with diabetes. In: Proc SIGCHI Conf Hum Factors Comput Syst (CHI 2006). p. 927
  3. Riggare S, Unruh KT, Sturr J, Domingos J (2017) Patient-driven n-of-1 in Parkinson’s disease. 123–8
  4. Mamykina L, Heitkemper EM, Smaldone AM, Kukafka R, Cole-Lewis HJ, Davidson PG, Mynatt ED, Cassells A, Tobin JN, Hripcsak G (2017) Personal discovery in diabetes self-management: discovering cause and effect using self-monitoring data. J Biomed Inform. 76:1–8
  5. Cepeda MS, Acevedo JC, Hernando A, Miranda N, Cortes C, Carr DB (2008) An n-of-1 trial as an aid to decision-making prior to implanting a permanent spinal cord stimulator. Pain Med 9(2):235–239
  6. Choe EK, Lee NB, Lee B, Pratt W, Kientz JA (2014) Understanding quantified-selfers’ practices in collecting and exploring personal data. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2014). New York, New York, USA; p. 1143–52
  7. Daskalova N, Metaxa-Kakavouli D, Tran A, Nugent N, Boergers J, McGeary J, Huang J (2016) SleepCoacher: a personalized automated self-experimentation system for sleep recommendations. In: Proc ACM Symp User Interface Softw Technol (UIST 2016). p. 347–58
  8. Karkar R, Zia JK, Vilardaga R, Mishra SR, Fogarty J, Munson SA, Kientz JA (2016) A framework for self-experimentation in personalized health. J Am Med Informatics Assoc. 23(3):440–448
  9. Karkar R, Schroeder J, Epstein DA, Pina LR, Scofield J, Fogarty J, Kientz JA, Munson SA, Vilardaga R, Zia JK (2017) TummyTrials: a feasibility study of using self-experimentation to detect individualized food triggers. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2017). p. 6850–63
  10. Kravitz RL, Duan N, DEcIDE Methods Center N-of-1 Guidance Panel (2014) Design and implementation of n-of-1 trials: a user’s guide. Agency Healthc Res Qual 13(14):1–88
  11. Gelman A, Weakliem D (2008) Of beauty, sex, and power: statistical challenges in estimating small effects. Am Sci 97(4):310–316
  12. Gelman A, Carlin J (2014) Beyond power calculations: assessing type S (sign) and type M (magnitude) errors. Perspect Psychol Sci. 9(6):641–651
  13. Kay M, Nelson GL, Hekler EB (2016) Researcher-centered design of statistics: why Bayesian statistics better fit the culture and incentives of HCI. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2016). p. 4521–32
  14. Schroeder J, Hoffswell J, Chung C-F, Fogarty J, Munson S, Zia JK (2017) Supporting patient-provider collaboration to identify individual triggers using food and symptom journals. In: Proc ACM Conf Comput Support Coop Work Soc Comput (CSCW 2017). p. 1726–39
  15. Li I, Dey AK, Forlizzi J (2010) A stage-based model of personal informatics systems. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2010). New York, New York, USA; p. 557–66
  16. Epstein DA, Ping A, Fogarty J, Munson SA (2015) A lived informatics model of personal informatics. In: Proc ACM Int Jt Conf Pervasive Ubiquitous Comput (UbiComp 2015). p. 731–42
  17. Mamykina L, Smaldone AM, Bakken SR (2015) Adopting the sensemaking perspective for chronic disease self-management. J Biomed Inform. 56:406–417
  18. Rooksby J, Rost M, Morrison A, Chalmers MC (2014) Personal tracking as lived informatics. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2014). New York, New York, USA; p. 1163–72
  19. Chung C-F, Cook J, Bales E, Zia JK, Munson SA (2015) More than telemonitoring: health provider use and nonuse of life-log data in irritable bowel syndrome and weight management. J Med Internet Res 17(8):e203
  20. Park SY, Chen Y (2015) Individual and social recognition: challenges and opportunities in migraine management. In: Proc ACM Conf Comput Support Coop Work Soc Comput. ACM Press, New York, USA, pp 1540–1551
  21. Mamykina L, Mynatt E, Davidson P, Greenblatt D (2008) MAHI: investigation of social scaffolding for reflective thinking in diabetes management. In: Proc SIGCHI Conf Hum Factors Comput Syst (CHI 2008). p. 477–86
  22. Schroeder J, Chung C-F, Epstein DA, Karkar R, Parsons A, Murinova N, Fogarty J, Munson SA (2018) Examining self-tracking by people with migraine: goals, needs, and opportunities in a chronic health condition. In: Proc ACM Conf Des Interact Syst (DIS 2018). To appear
  23. Consolvo S, McDonald DW, Toscos T, Chen MY, Froehlich JE, Harrison BL, Klasnja P, La Marca A, Le Grand L, Libby R, Smith IE, Landay JA (2008) Activity sensing in the wild: a field trial of Ubifit Garden. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2008). p. 1797–806
  24. Fitbit [Internet]
  25. Jawbone UpBand [Internet]
  26. Larklife [Internet]
  27. Lin JJ, Mamykina L, Lindtner S, Delajoux G, Strub HB (2006) Fish’n’Steps: encouraging physical activity with an interactive computer game. Ubiquitous Comput (UbiComp 2006). p. 261–78
  28. Nike Fuelband [Internet]
  29. Kay M, Choe EK, Shepherd J, Greenstein B, Watson NF, Consolvo S, Kientz JA (2012) Lullaby: a capture & access system for understanding the sleep environment. In: Proc ACM Conf Ubiquitous Comput (UbiComp 2012). p. 226–34
  30. Baumer EPS, Katz SJ, Freeman JE, Adams P, Gonzales AL, Pollak J, Retelny D, Niederdeppe J, Olson CM, Gay GK (2012) Prescriptive persuasion and open-ended social awareness: expanding the design space of mobile health. In: Proc ACM Conf Comput Support Coop Work (CSCW 2012). p. 475–84
  31. Cordeiro F, Bales E, Cherry E, Fogarty J (2015) Rethinking the mobile food journal: exploring opportunities for lightweight photo-based capture. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2015). p. 3207–16
  32. Ali AA, Hossain SM, Hovsepian K, Plarre K, Kumar S (2012) mPuff: automated detection of cigarette smoking puffs from respiration measurements. In: Proc Conf Inf Process Sens Networks (IPSN 2012). p. 269–80
  33. Morris M, Guilak F (2009) Mobile heart health: project highlight. IEEE Pervasive Comput. 8(2):57–61
  34. Jorgensen JT (2009) New era of personalized medicine: a 10-year anniversary. Oncologist. 14(5):557–558
  35. Swan M (2009) Emerging patient-driven health care models: an examination of health social networks, consumer personalized medicine and quantified self-tracking. Int J Environ Res Public Health. 6(2):492–525
  36. Lillie EO, Patay B, Diamant J, Issell B, Topol EJ, Schork NJ (2011) The n-of-1 clinical trial: the ultimate strategy for individualizing medicine? Per Med 8(2):161–173
  37. Riley WT, Glasgow RE, Etheredge L, Abernethy AP (2013) Rapid, responsive, relevant (r3) research: a call for a rapid learning health research enterprise. Clin Transl Med 2(1):10
  38. Barlow DH, Hayes SC (1979) Alternating treatments design: one strategy for comparing the effects of two treatments in a single subject. J Appl Behav Anal. 12(2):199–210
  39. Larson EB (1990) N-of-1 clinical trials: a technique for improving medical therapeutics. West J Med 152(1):52–56
  40. Barlow DH, Nock MK, Hersen M (2008) Single case experimental designs: strategies for studying behavior change. 3rd ed. Pearson; 416 p
  41. Barr C, Marois M, Sim I, Schmid CH, Wilsey B, Ward D, Duan N, Hays RD, Selsky J, Servadio J, Schwartz M, Dsouza C, Dhammi N, Holt Z, Baquero V, MacDonald S, Jerant A, Sprinkle R, Kravitz RL (2015) The PREEMPT study - evaluating smartphone-assisted n-of-1 trials in patients with chronic pain: study protocol for a randomized controlled trial. Trials 16:67
  42. PACO: The Personal Analytics Companion [Internet]
  43. Trialist - ohmage [Internet]
  44. Daskalova N, Desingh K, Kim JY, Zhang L, Papoutsaki A, Huang J (2017) Lessons learned from two cohorts of personal informatics self-experiments. In: Proc ACM Conf Ubiquitous Comput. p. 46
  45. Lee J, Walker E, Burleson W, Kay M, Buman M, Hekler EB (2017) Self-experimentation for behavior change: design and formative evaluation of two approaches. In: Proc SIGCHI Conf Hum Factors Comput Syst. p. 6837–49
  46. Kruschke JK, Liddell TM (2017) The Bayesian new statistics: hypothesis testing, estimation, meta-analysis, and planning from a Bayesian perspective. Psychon Bull Rev. p. 1–29
  47. Gelman A, Hill J, Yajima M (2012) Why we (usually) don’t have to worry about multiple comparisons. J Res Educ Eff 5(2):189–211
  48. Elsenbruch S (2011) Abdominal pain in irritable bowel syndrome: a review of putative psychological, neural and neuro-immune mechanisms. Brain Behav Immun. 25(3):386–394
  49. Lovell RM, Ford AC (2012) Effect of gender on prevalence of irritable bowel syndrome in the community: systematic review and meta-analysis. Am J Gastroenterol. 107:991–1000
  50. Ladabaum U, Boyd E, Zhao WK, Mannalithara A, Sharabidze A, Singh G, Chung E, Levin TR (2012) Diagnosis, comorbidities, and management of irritable bowel syndrome in patients in a large health maintenance organization. Clin Gastroenterol Hepatol. 10(1):37–45
  51. Mitra D, Davis KL, Baran RW (2011) All-cause healthcare charges among managed care patients with constipation and comorbid irritable bowel syndrome. Postgrad Med. 123(3):122–132
  52. Harris LR, Roberts L (2008) Treatments for irritable bowel syndrome: patients’ attitudes and acceptability. BMC Complement Altern Med. 8:65
  53. Heitkemper M, Carter E, Ameen V, Olden K, Cheng L (2002) Women with irritable bowel syndrome: differences in patients’ and physicians’ perceptions. Gastroenterol Nurs 25(5):192–200
  54. Monsbakken K, Vandvik P, Farup P (2006) Perceived food intolerance in subjects with irritable bowel syndrome - etiology, prevalence and consequences. Eur J Clin Nutr 60(5):667–672
  55. Simrén M, Månsson A, Langkilde AM, Svedlund J, Abrahamsson H, Bengtsson U, Björnsson ES (2001) Food-related gastrointestinal symptoms in the irritable bowel syndrome. Digestion 63(2):108–115
  56. Zia JK, Barney P, Cain KC, Jarrett ME, Heitkemper MM (2016) A comprehensive self-management irritable bowel syndrome program produces sustainable changes in behavior after 1 year. Clin Gastroenterol Hepatol 14(2):212–219
  57. Parker TJ, Naylor SJ, Riordan AM, Hunter JO (1995) Management of patients with food intolerance in irritable bowel syndrome: the development and use of an exclusion diet. J Hum Nutr Diet 8(3):159–166
  58. American Gastroenterological Association (2002) American Gastroenterological Association medical position statement: irritable bowel syndrome. Gastroenterology 123:2105–7
  59. Zia JK, Chung C-F, Xu K, Dong Y, Cain KC, Munson SA, Heitkemper MM. Inter-rater reliability of healthcare provider interpretations of food and gastrointestinal symptom paper diaries of patients with irritable bowel syndrome. In preparation
  60. Choe EK, Duarte ME, Kientz JA (2010) Understanding and designing computing technologies that convey concerning health news. In: Proc Int Conf Des Emot (D&E 2010). p. 1–12
  61. Eswaran S, Tack J, Chey WD (2011) Food: the forgotten factor in the irritable bowel syndrome. Gastroenterol Clin N Am 40(1):141–162
  62. Loken E, Gelman A (2017) Measurement error and the replication crisis. Science 355(6325):584–585
  63. Wasserstein RL, Lazar NA (2016) The ASA’s statement on p-values: context, process, and purpose. Am Stat. 70(2):129–133
  64. Walker E, Nowacki AS (2011) Understanding equivalence and noninferiority testing. J Gen Intern Med 26(2):192–196
  65. Morey RD, Hoekstra R, Rouder JN, Lee MD, Wagenmakers E-J (2016) The fallacy of placing confidence in confidence intervals. Psychon Bull Rev 23(1):103–123
  66. Hoekstra R, Morey RD, Rouder JN, Wagenmakers E-J (2014) Robust misinterpretation of confidence intervals. Psychon Bull Rev. 21(5):1157–1164
  67. Gelman A, Carlin JB, Stern HS, Dunson DB, Vehtari A, Rubin DB (2013) Bayesian data analysis. 3rd ed. Chapman and Hall/CRC; 675 p
  68. Goldstein DG, Rothschild D (2014) Lay understanding of probability distributions. J Soc Judgm Decis Mak 9(1):1–14
  69. Gigerenzer G (2004) Mindless statistics. J Socio Econ. 33(5):587–606
  70. Benjamin DJ, Berger JO, Johannesson M, Nosek BA, Wagenmakers E-J, Berk R, Bollen KA, Brembs B, Johnson VE, et al. (2017) Redefine statistical significance. Nat Hum Behav.
  71. Carpenter B, Gelman A, Hoffman M, Lee D, Goodrich B, Betancourt M, Brubaker MA, Li P, Riddell A (2016) Stan: a probabilistic programming language. J Stat Softw. 76(1)
  72. Ancker JS, Senathirajah Y, Kukafka R, Starren JB (2006) Design features of graphs in health risk communication: a systematic review. J Am Med Informatics Assoc 13(6):608–619
  73. Kay M, Kola T, Hullman JR, Munson SA (2016) When(ish) is my bus?: user-centered visualizations of uncertainty in everyday, mobile predictive systems. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2016). p. 5092–103
  74. Fernandes M, Walls L, Munson S, Hullman J, Kay M (2018) Uncertainty displays using quantile dotplots or CDFs improve transit decision-making. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2018). To appear
  75. Scott SL, Varian HR (2014) Predicting the present with Bayesian structural time series. Int J Math Model Numer Optim. 5(1/2)
  76. Garcia-Retamero R, Cokely ET (2013) Communicating health risks with visual aids. Curr Dir Psychol Sci. 22(5):392–399
  77. Jung MF, Sirkin D, Gür TM, Steinert M (2015) Displayed uncertainty improves driving experience and behavior. In: Proc ACM Conf Hum Factors Comput Syst (CHI 2015). p. 2201–10
  78. McShane BB, Gal D, Gelman A, Robert C, Tackett JL (2017) Abandon statistical significance. 1–12
  79. Cummings P (2011) Arguments for and against standardized mean differences (effect sizes). Arch Pediatr Adolesc Med. 165(7):592–596
  80. Betancourt M (2018) Calibrating model-based inferences and decisions. 1–35


We thank Eric B. Hekler and Roger Vilardaga for conversations that informed this research.


This research was funded in part by a University of Washington Innovation Research Award, the National Science Foundation under awards IIS-1553167 and SCH-1344613, and the Agency for Healthcare Research and Quality under award 1R21HS023654.

Author information



Corresponding author

Correspondence to Jessica Schroeder.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflicts of interest.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Schroeder, J., Karkar, R., Fogarty, J. et al. A Patient-Centered Proposal for Bayesian Analysis of Self-Experiments for Health. J Healthc Inform Res 3, 124–155 (2019).


  • Self-experiment
  • N-of-1
  • Interface design
  • User-centered design
  • Self-tracking
  • Bayesian analysis