Probability Theory

Kaptein, Maurits; van den Heuvel, Edwin

doi:10.1007/978-3-030-10531-0_3

Part of the book series: Undergraduate Topics in Computer Science (UTICS)

Abstract

Statistics is a science that is concerned with principles, methods, and techniques for collecting, processing, analyzing, presenting, and interpreting (numerical) data. Statistics can be divided roughly into descriptive statistics (Chap. 1) and inferential statistics (Chap. 2), as we have already suggested. Descriptive statistics summarizes and visualizes the observed data. It is usually not very difficult, but it forms an essential part of reporting (scientific) results. Inferential statistics tries to draw conclusions from the data that would hold true for part or the whole of the population from which the data is collected. The theory of probability, which is the topic of the next two theoretical chapters, makes it possible to connect the two disciplines of descriptive and inferential statistics. We have already encountered some ideas from probability theory in the previous chapter. To start with, we discussed the probability of selecting a specific sample \(\pi_k\) and we briefly defined the notion of probability based on the throwing of a die. In this chapter we work out these ideas more formally: we define probabilities of events and discuss how to calculate with them. In the previous chapter, when discussing bias, we also encountered the expected population parameter \(\mathbb{E}(T)\), but we have not yet detailed what expectations are exactly; this is something we cover in Chap. 4.

Notes

1. It should be noted here that a probability of zero does not necessarily mean that the event will never occur. This seems contradictory, but we will explain this later. On the other hand, if the event can never occur, the probability is zero.
2. Using the definition in Eq. (3.3), we can write \(\Pr(A\cap B)\) as \(\Pr(A|B)\Pr(B)\), as we did in Table 3.1, but also as \(\Pr(B|A)\Pr(A)\). Which one to use mostly depends on the practical situation. In Table 3.1 we could have used \(\Pr(B|A)\Pr(A)\) as well.
3. If, in this case, the population size(s) were known, we could calculate weighted averages to estimate the population parameters as we did in Chap. 2.
4. Note that Simpson’s Paradox and its solutions are still heavily debated (see Armistead 2014 for examples).
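
The identity in note 2 can be checked numerically. The following is a minimal Python sketch, not taken from the chapter: the counts are hypothetical (they are not the data of Table 3.1), the helper names are illustrative, and conditional probabilities are computed as relative frequencies within the conditioning event, which is assumed to agree with the definition in Eq. (3.3).

```python
from fractions import Fraction

# Hypothetical counts for illustration only (NOT the data of Table 3.1).
# Keys are (A occurs, B occurs); values are numbers of units.
counts = {
    (True, True): 30,
    (True, False): 20,
    (False, True): 10,
    (False, False): 40,
}
total = sum(counts.values())


def count(event):
    """Number of units for which event(a, b) is true."""
    return sum(n for (a, b), n in counts.items() if event(a, b))


n_a = count(lambda a, b: a)
n_b = count(lambda a, b: b)
n_ab = count(lambda a, b: a and b)

# Marginal and joint probabilities as exact fractions.
pr_a = Fraction(n_a, total)
pr_b = Fraction(n_b, total)
pr_a_and_b = Fraction(n_ab, total)

# Conditional probabilities as relative frequencies within the
# conditioning event (assumed to match the chapter's definition).
pr_a_given_b = Fraction(n_ab, n_b)
pr_b_given_a = Fraction(n_ab, n_a)

# Both factorizations of Pr(A and B).
via_b = pr_a_given_b * pr_b
via_a = pr_b_given_a * pr_a

print(pr_a_and_b, via_b, via_a)  # prints: 3/10 3/10 3/10
```

Exact fractions are used instead of floating-point numbers so that the three values can be compared for equality without rounding error.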

References

T.W. Armistead, Resurrecting the third variable: a critique of Pearl’s causal analysis of Simpson’s paradox. Am. Stat. 68(1), 1–7 (2014)
C.R. Charig, D.R. Webb, S.R. Payne, J.E. Wickham, Comparison of treatment of renal calculi by open surgery, percutaneous nephrolithotomy, and extracorporeal shockwave lithotripsy. Br. Med. J. (Clin. Res. Ed.) 292(6524), 879–882 (1986)
G. Grimmett, D. Stirzaker et al., Probability and Random Processes (Oxford University Press, Oxford, 2001)
N.P. Jewell, Statistics for Epidemiology (Chapman and Hall/CRC, Boca Raton, 2003)
R. Lanting, E.R. Van Den Heuvel, B. Westerink, P.M. Werker, Prevalence of Dupuytren disease in the Netherlands. Plast. Reconstr. Surg. 132(2), 394–403 (2013)
K.J. Rothman, S. Greenland, T.L. Lash et al., Modern Epidemiology, vol. 3 (Wolters Kluwer Health/Lippincott Williams & Wilkins, Philadelphia, 2008)
E.H. Simpson, The interpretation of interaction in contingency tables. J. Roy. Stat. Soc.: Ser. B (Methodol.) 13(2), 238–241 (1951)
E.P. Veening, R.O.B. Gans, J.B.M. Kuks, Medische Consultvoering (Bohn Stafleu van Loghum, Houten, 2009)
E. White, B.K. Armstrong, R. Saracci, Principles of Exposure Measurement in Epidemiology: Collecting, Evaluating and Improving Measures of Disease Risk Factors (OUP, Oxford, 2008)
F.N. David, Studies in the History of Probability and Statistics I. Dicing and Gaming (A Note on the History of Probability). Biometrika 42(1/2), 1–5 (1955)
O.B. Sheynin, Early history of the theory of probability. Archive for History of Exact Sciences 17(3), 201–259 (1977)
S.M. Stigler, Studies in the History of Probability and Statistics. XXXIV: Napoleonic statistics: The work of Laplace. Biometrika 62(2), 503–517 (1975)

Author information

Authors and Affiliations

Tilburg University, Tilburg, Noord-Brabant, The Netherlands
Maurits Kaptein
Department of Mathematics and Computer Science, Eindhoven University of Technology, Eindhoven, Noord-Brabant, The Netherlands
Edwin van den Heuvel

Corresponding author

Correspondence to Maurits Kaptein.

About this chapter

Cite this chapter

Kaptein, M., van den Heuvel, E. (2022). Probability Theory. In: Statistics for Data Scientists. Undergraduate Topics in Computer Science. Springer, Cham. https://doi.org/10.1007/978-3-030-10531-0_3

DOI: https://doi.org/10.1007/978-3-030-10531-0_3
Published: 02 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10530-3
Online ISBN: 978-3-030-10531-0
eBook Packages: Computer Science, Computer Science (R0)
