Abstract
This chapter provides an overview of the principal aspects of statistical inference as it is encountered in epidemiology. It makes minimal assumptions about the theoretical background and attempts to provide a simple introduction to both the frequentist and Bayesian approaches. The former still predominates in the applied literature, while theoreticians increasingly argue that it should be replaced by the latter, Bayesian approach. In this chapter, we take the view that these two approaches to inference should not be regarded as antagonistic, but rather as aiming at fundamentally different objectives, which may be regarded respectively as the analysis of data and the weighing of evidence – the latter typically derived from essentially different sources.
Notes
- 1.
This most useful word means a “quantity constant in (the) case considered, but varying in different cases” (Concise Oxford Dictionary). Its use to mean “boundary” (perhaps by confusion with “perimeter”), “limits” or “constraints” is to be deprecated.
- 2.
Mathematically speaking, a random variable is a function associating each possible outcome of an experiment with a particular numerical value, but at an elementary level, we can take it to be what it sounds like – a quantity that is variable from case to case in a manner that is intrinsically random.
- 3.
We follow the usual notational convention that a random variable is denoted by an uppercase letter and a specific value by a lowercase letter. \(\mathrm{P}[\cdot ]\) denotes the probability of an event.
- 4.
The variance is the square of the SD and is a measure of spread, or dispersion, that is more convenient than the SD for some purposes.
- 5.
This key result in statistical inference was first expounded by the Rev. Thomas Bayes (1701–1761), an English Presbyterian minister.
- 6.
An increasing transformation of x returns a value f(x) with the property that \(f(x_{2}) > f(x_{1})\) whenever \(x_{2} > x_{1}\).
- 7.
In other cases, it may be difficult or impossible to determine exactly, but the important Central Limit Theorem ensures that it will be very nearly normal in most circumstances.
- 8.
Technically, the precision is the reciprocal of the square of the s.e., that is, of the variance of the sampling distribution.
- 9.
Interestingly, because the unbiased estimator has slightly greater variability, neither this nor the MLE has the smallest mean squared error – the estimator with this property is \(Q^{2}/(n+1)\).
- 10.
The chi-squared distribution with k d.f., denoted by \(\chi _{k}^{2}\), may be defined as the distribution of the sum of squares of k independent variates having the standard normal distribution.
- 11.
Karl Pearson (1857–1936) effectively founded the discipline of scientific statistics at University College, London, in the first quarter of the twentieth century; his son E.S. Pearson was the first Professor of Statistics in the same college.
- 12.
Student was the pen name of W.S. Gosset (1876–1937), who worked for Guinness, by whom he was not permitted to publish the results of his work.
- 13.
In an epidemiological context, see, for example, the book by Clayton and Hills (1993).
- 14.
It is an interesting mathematical property of the \(\chi _{2}^{2}\)-distribution that this value of ρ is exact, and indeed, the plausibility fraction for a two-dimensional 1 − α confidence region is α for any α, at least to the extent that 2Λ is approximately distributed as \(\chi _{2}^{2}\).
- 15.
logit(p) is defined as log(p ∕ (1 − p)).
- 16.
In spite of the usefulness of this approach, this expression does not appear regularly in the literature as yet.
- 17.
It is relatively easy to write a function performing these calculations in R, for example.
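Several of the definitions in the notes above (Notes 8, 9, 10, 14 and 15) can be checked numerically. The following is a minimal Python sketch rather than the chapter's own code. All function names are hypothetical, and it rests on assumptions not stated in the notes themselves: that the data in Note 9 are normal and that the numerator there is the sum of squared deviations from the sample mean, and that the plausibility fraction in Note 14 is the likelihood relative to its maximum, exp(−Λ), evaluated on the boundary of the confidence region.

```python
import math
import random


def logit(p):
    """Note 15: logit(p) = log(p / (1 - p)), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def expit(x):
    """Inverse of logit (the name is common usage, not taken from the chapter)."""
    return 1.0 / (1.0 + math.exp(-x))


def precision(se):
    """Note 8: precision is the reciprocal of the squared standard error."""
    if se <= 0:
        raise ValueError("the standard error must be positive")
    return 1.0 / se ** 2


def boundary_likelihood_ratio(alpha):
    """Note 14, under the stated assumption that rho = exp(-Lambda).

    For 2 d.f. the chi-squared upper tail is P(X > c) = exp(-c / 2), so the
    critical value for a 1 - alpha region is c = -2 * log(alpha). On the
    boundary 2 * Lambda = c, hence exp(-Lambda) = exp(-c / 2).
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    c = -2.0 * math.log(alpha)
    return math.exp(-c / 2.0)


def mse_of_scaled_sum_of_squares(n, divisor, sigma2=1.0):
    """Note 9: mean squared error of (sum of squares) / divisor for normal data.

    Assumes (sum of squares) / sigma2 is chi-squared with n - 1 d.f.,
    which has mean n - 1 and variance 2 * (n - 1).
    """
    df = n - 1
    variance = 2.0 * df * sigma2 ** 2 / divisor ** 2
    bias = df * sigma2 / divisor - sigma2
    return variance + bias ** 2


def simulate_chi_squared(k, draws, seed=0):
    """Note 10: sums of squares of k independent standard normal variates."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(draws)]


if __name__ == "__main__":
    n = 10
    for divisor in (n - 1, n, n + 1):
        print(divisor, mse_of_scaled_sum_of_squares(n, divisor))
    print(boundary_likelihood_ratio(0.05))
    sample = simulate_chi_squared(k=3, draws=20000)
    print(sum(sample) / len(sample))  # should be close to k
```

Under these assumptions, the three divisors n − 1, n and n + 1 correspond to the unbiased estimator, the MLE and the minimum mean squared error estimator of Note 9, and the value returned by `boundary_likelihood_ratio` reproduces α up to rounding error.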
References
Agresti A, Gottard A (2007) Nonconservative exact small-sample inference for discrete data. Comput Stat Data Anal 51:6447–6458
Armitage P, Berry G, Matthews JNS (2002) Statistical methods in medical research, 4th edn. Blackwell, Oxford
Bortolussi R, Krishnan C, Armstrong D, Tovichayathamrong P (1978) Prognosis for survival in neonatal meningitis: clinical and pathologic review of 52 cases. Can Med Assoc J 118:165–168
Clayton D, Hills M (1993) Statistical models in epidemiology. Oxford University Press, Oxford
Cox DR (2006) Principles of statistical inference. Cambridge University Press, Cambridge
Czerniawska J, Bielen P, Plywaczewski R, Czystowska M, Korzybski D, Sliwinski P, Gorecka D (2008) Metabolic abnormalities in obstructive sleep apnea patients. Pneumonol Alergol Pol 76(5):340–347
Draper GJ, Stiller CA, Cartwright RA, Craft AW, Vincent TJ (1993) Cancer in Cumbria and in the vicinity of the Sellafield nuclear installation, 1963–90. Br Med J 306:89–94
Everitt BS (1994) A handbook of statistical analyses using S-PLUS. Chapman and Hall, London
Gamerman D, Lopes HF (2006) Markov chain Monte Carlo: stochastic simulation for Bayesian inference, 2nd edn. Chapman and Hall/CRC, London
Greenland S (2006) Bayesian perspectives for epidemiological research: I. Foundations and basic methods. Int J Epidemiol 35:765–775
Hoenig JM, Heisey DM (2001) The abuse of power: the pervasive fallacy of power calculations for data analysis. Am Stat 55(1):19–24
Lee PM (2004) Bayesian statistics: an introduction, 2nd edn. Arnold and Oxford University Press, New York
Marriott FHC (1979) Barnard’s Monte Carlo tests: how many simulations? Appl Stat 28(1):75–77
Myint PK, Sinha S, Wareham NJ, Bingham SA, Luben RN, Welch AA, Khaw KT (2007) Glycated hemoglobin and risk of stroke in people without known diabetes in the European Prospective Investigation into Cancer (EPIC)-Norfolk prospective population study – A threshold relationship? Stroke 38(2):271–275
Neyman J, Pearson ES (1933) On the problem of the most efficient tests of statistical hypotheses. Philos Trans R Soc Lond A 231:289–337
Rice JA (2007) Mathematical statistics and data analysis, 3rd edn. Thomson/Brooks Cole, Belmont
Simon D, Senan C, Garnier P, Saint-Paul M, Papoz L (1989) Epidemiological features of glycated haemoglobin A1c-distribution in a healthy population: the Telecom study. Diabetologia 32:864–869
Washburn TC, Medearis DN, Childs B (1965) Sex differences in susceptibility to infections. Pediatrics 35:57–64
Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1:80–83
Copyright information
© 2014 Springer Science+Business Media New York
About this entry
Cite this entry
Bithell, J.F. (2014). Statistical Inference. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-09834-0_54
DOI: https://doi.org/10.1007/978-0-387-09834-0_54
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-09833-3
Online ISBN: 978-0-387-09834-0
eBook Packages: Medicine, Reference Module Medicine