End-User Probabilistic Programming

  • Judith Borghouts
  • Andrew D. Gordon
  • Advait Sarkar
  • Neil Toronto
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11785)

Abstract

Probabilistic programming aims to help users make decisions under uncertainty. The user writes code representing a probabilistic model, and receives outcomes as distributions or summary statistics. We consider probabilistic programming for end-users, in particular spreadsheet users, estimated to number in tens to hundreds of millions. We examine the sources of uncertainty actually encountered by spreadsheet users, and their coping mechanisms, via an interview study. We examine spreadsheet-based interfaces and technology to help reason under uncertainty, via probabilistic and other means. We show how uncertain values can propagate uncertainty through spreadsheets, and how sheet-defined functions can be applied to handle uncertainty. Hence, we draw conclusions about the promise and limitations of probabilistic programming for end-users.

Notes

Acknowledgements

Thanks to Breck Baldwin and Matthijs Vákár for information regarding Stan. We are grateful to Alan Blackwell, Eunice Jun, Tom Minka, and Simon Peyton Jones for their helpful comments on a draft of this paper.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Judith Borghouts (1)
  • Andrew D. Gordon (1)
  • Advait Sarkar (1), corresponding author
  • Neil Toronto (1)
  1. Microsoft Research, Cambridge, UK
