Probability Distributome: a web computational infrastructure for exploring the properties, interrelations, and applications of probability distributions
Probability distributions are useful for modeling, simulation, analysis, and inference on a wide variety of natural processes and physical phenomena. There are uncountably many probability distributions. However, only a few dozen families of distributions are commonly defined and frequently used in practice for problem solving, experimental applications, and theoretical studies. In this paper, we present a new computational and graphical infrastructure, the Distributome, which facilitates the discovery, exploration, and application of a diverse spectrum of probability distributions. The extensible Distributome infrastructure provides interfaces for (human and machine) traversal, search, and navigation of all common probability distributions. It also supports distribution modeling, applications, and the investigation of inter-distribution relations, together with their analytical representations and computational utilization. The entire Distributome framework is designed and implemented as an open-source, community-built, and Internet-accessible infrastructure. It is portable, extensible, and compatible with HTML5 and Web 2.0 standards (http://Distributome.org). We demonstrate two types of applications of the probability Distributome resources: computational research and science education. The Distributome tools may be employed to address five complementary computational modeling applications: simulation; data analysis and inference; model-fitting; examination of the analytical, mathematical, and computational properties of specific probability distributions; and exploration of inter-distributional relations. Many high school and college science, technology, engineering, and mathematics (STEM) courses may be enriched by the use of modern pedagogical approaches and technology-enhanced methods.
The Distributome resources provide enhancements for blended STEM education by improving student motivation, augmenting the classical curriculum with interactive webapps, and overhauling the learning assessment protocols.
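To make the idea of an inter-distribution relation concrete, the following minimal sketch checks one textbook relation by simulation: the sum of the squares of k independent standard normal variables follows a chi-square distribution with k degrees of freedom, which has mean k and variance 2k. The relation is chosen here purely for illustration, and the function name `simulate_chi_square` is hypothetical; the code uses only the Python standard library and is not part of the Distributome code base.

```python
import random
import statistics


def simulate_chi_square(k, n_samples, seed=0):
    """Draw n_samples values of Z_1^2 + ... + Z_k^2, with each Z_i ~ N(0, 1).

    Hypothetical helper for illustration only; not a Distributome function.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        for _ in range(n_samples)
    ]


k = 4
samples = simulate_chi_square(k, n_samples=20000)

# Compare the simulated moments with the chi-square(k) theory values.
sample_mean = statistics.fmean(samples)
sample_var = statistics.variance(samples)

print(f"sample mean     = {sample_mean:.3f}  (theory: {k})")
print(f"sample variance = {sample_var:.3f}  (theory: {2 * k})")
```

The same pattern (simulate from one distribution, transform, compare against the target distribution's known moments) covers both the simulation and the inter-distributional-relation use cases listed in the abstract.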
Keywords
Probability distributions, Models, Graphical user interface, Transformations, Applications, Inference, Distributome
Acknowledgments
The development of the Distributome infrastructure was partially supported by NSF Grants 1023115, 1022560, 1022636, 0089377, 9652870, 0442992, 0442630, 0333672, and 0716055, and by NIH Grants U54 RR021813, P20 NR015331, U54 EB020406, P50 NS091856, and P30 DK089503.
Significant contributions from Lawrence Moore, David Aldous, Robert Dobrow and James Pitman ensured that the Distributome infrastructure is generic, complete and extensible. The authors also thank Syed Husain, Selvam Palanimalai, John Guo Jun, Philip Chu, Yunzhong He, Yunzhu He, Prarthana Alevoor and Shelley Zhou Yuhao for their ideas and help with development and validation of the Distributome infrastructure. Glen Marian proofread the final manuscript. Journal referees and editorial staff provided valuable suggestions that improved the manuscript.
Conflict of interest
The authors do not have potential conflicts of interest outside of the funding sources referred to in the acknowledgment section.
Ethical Standard
The results of this research did not involve human participants, animals, or data derived from human or animal studies.