Statistical Inference

Bithell, John F.

doi:10.1007/978-0-387-09834-0_54

Abstract

This chapter provides an overview of the principal aspects of statistical inference as it is encountered in epidemiology. It makes minimal assumptions about the theoretical background and attempts to provide a simple introduction to both the frequentist and Bayesian approaches. The former still predominates in the applied literature, while theoreticians increasingly argue that it should be replaced by the latter, Bayesian approach. In this chapter, we take the view that these two approaches to inference should not be regarded as antagonistic, but rather as aiming at fundamentally different objectives, which may be regarded respectively as the analysis of data and the weighing of evidence – the latter typically derived from essentially different sources.

Notes

1.
This most useful word means a “quantity constant in (the) case considered, but varying in different cases” (Concise Oxford Dictionary). Its use to mean “boundary” (perhaps by confusion with “perimeter”), “limits” or “constraints” is to be deprecated.
2.
Mathematically speaking, a random variable is a function associating each possible outcome of an experiment with a particular numerical value, but at an elementary level, we can take it to be what it sounds like – a quantity that is variable from case to case in a manner that is intrinsically random.
3.
We follow the usual notational convention that a random variable is denoted by an uppercase letter and a specific value by a lowercase letter. \(\mathrm{P}[\cdot ]\) denotes the probability of an event.
4.
The variance is the square of the SD and is a measure of spread, or dispersion, that is more convenient than the SD for some purposes.
5.
This key result in statistical inference was first expounded by the Rev. Thomas Bayes (1701–1761), an English Presbyterian minister.
6.
An increasing transformation of x returns a value f(x) with the property that f(x₂) > f(x₁) whenever x₂ > x₁.
7.
In other cases, it may be difficult or impossible to determine exactly, but the important Central Limit Theorem ensures that it will be very nearly normal in most circumstances.
8.
Technically, the precision is the reciprocal of the square of the s.e., that is, of the variance of the sampling distribution.
9.
Interestingly, because the unbiased estimator has slightly greater variability, neither this nor the MLE has the smallest mean squared error – the estimator with this property is Q²∕(n + 1).
10.
The chi-squared distribution with k d.f., denoted by \(\chi _{k}^{2}\), may be defined as the distribution of the sum of squares of k independent variates having the standard normal distribution.
11.
Karl Pearson (1857–1936) effectively founded the discipline of scientific statistics at University College, London, in the first quarter of the twentieth century; his son E.S. Pearson was the first Professor of Statistics in the same college.
12.
Student was the pen name of W.S. Gosset (1876–1937), who worked for Guinness, by whom he was not permitted to publish the results of his work.
13.
In an epidemiological context, see, for example, the book by Clayton and Hills (1993).
14.
It is an interesting mathematical property of the \(\chi _{2}^{2}\)-distribution that this value of ρ is exact, and indeed, the plausibility fraction for a two-dimensional 1 − α confidence region is α for any α, at least to the extent that 2Λ is approximately distributed as \(\chi _{2}^{2}\).
15.
logit(p) is defined as log(p ∕ (1 − p)).
16.
In spite of the usefulness of this approach, this expression does not appear regularly in the literature as yet.
17.
It is relatively easy to write a function performing these calculations in R, for example.
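
Several of the notes above state definitions or results that can be checked numerically. The following is a minimal Python sketch, not the author's code. All function names are hypothetical. For note 9 it assumes normally distributed data with Q² the sum of squared deviations from the sample mean, so that Q²∕σ² follows a chi-squared distribution with n − 1 d.f. For note 14 it assumes that the plausibility fraction is the likelihood ratio exp(−Λ) on the boundary of the confidence region.

```python
import math
import random


def logit(p: float) -> float:
    """Note 15: logit(p) = log(p / (1 - p)), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(x: float) -> float:
    """Inverse of logit: maps a real number x back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate_chi_squared(k: int, draws: int, seed: int = 0) -> list:
    """Note 10: each draw is the sum of squares of k independent N(0, 1) variates."""
    rng = random.Random(seed)
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k)) for _ in range(draws)]


def variance_estimator_mse(n: int, divisor: float, sigma2: float = 1.0) -> float:
    """Note 9 (assumed setting): mean squared error of Q2 / divisor.

    Assumes Q2 / sigma2 is chi-squared with n - 1 d.f., which has
    mean n - 1 and variance 2(n - 1).
    """
    df = n - 1
    bias = (df / divisor - 1.0) * sigma2
    variance = 2.0 * df * sigma2 ** 2 / divisor ** 2
    return bias ** 2 + variance


def plausibility_fraction_2d(alpha: float) -> float:
    """Note 14 (assumed definition): exp(-Lambda) on the boundary of a 1 - alpha region."""
    # The chi-squared(2) upper tail probability is exp(-x / 2),
    # so its upper-alpha point is -2 log(alpha).
    critical_value = -2.0 * math.log(alpha)
    # On the boundary, 2 * Lambda equals the critical value.
    return math.exp(-critical_value / 2.0)
```

Under these assumptions, comparing variance_estimator_mse(n, n - 1), variance_estimator_mse(n, n) and variance_estimator_mse(n, n + 1) shows the divisor n + 1 giving the smallest value, and plausibility_fraction_2d(alpha) returns alpha up to floating-point rounding.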

References

Agresti A, Gottard A (2007) Nonconservative exact small-sample inference for discrete data. Comput Stat Data Anal 51:6447–6458
Armitage P, Berry G, Matthews JNS (2002) Statistical methods in medical research, 4th edn. Blackwell, Oxford
Bortolussi R, Krishnan C, Armstrong D, Tovichayathamrong P (1978) Prognosis for survival in neonatal meningitis: clinical and pathologic review of 52 cases. Can Med Assoc J 118:165–168
Clayton D, Hills M (1993) Statistical models in epidemiology. Oxford University Press, Oxford
Cox DR (2006) Principles of statistical inference. Cambridge University Press, Cambridge
Czerniawska J, Bielen P, Plywaczewski R, Czystowska M, Korzybski D, Sliwinski P, Gorecka D (2008) Metabolic abnormalities in obstructive sleep apnea patients. Pneumonol Alergol Pol 76(5):340–347
Draper GJ, Stiller CA, Cartwright RA, Craft AW, Vincent TJ (1993) Cancer in Cumbria and in the vicinity of the Sellafield nuclear installation, 1963–90. Br Med J 306:89–94
Everitt BS (1994) A handbook of statistical analyses using S-PLUS. Chapman and Hall, London
Gamerman D, Lopes HF (2006) Markov chain Monte Carlo: stochastic simulation for Bayesian inference, 2nd edn. Chapman and Hall/CRC, London
Greenland S (2006) Bayesian perspectives for epidemiological research: I. Foundations and basic methods. Int J Epidemiol 35:765–775
Hoenig JM, Heisey DM (2001) The abuse of power: the pervasive fallacy of power calculations for data analysis. Am Stat 55(1):19–24
Lee PM (2004) Bayesian statistics: an introduction, 2nd edn. Arnold and Oxford University Press, New York
Marriott FHC (1979) Barnard’s Monte Carlo tests: how many simulations? Appl Stat 28(1):75–77
Myint PK, Sinha S, Wareham NJ, Bingham SA, Luben RN, Welch AA, Khaw KT (2007) Glycated hemoglobin and risk of stroke in people without known diabetes in the European Prospective Investigation into Cancer (EPIC)-Norfolk prospective population study – A threshold relationship? Stroke 38(2):271–275
Neyman J, Pearson ES (1933) On the problem of the most efficient tests of statistical hypotheses. Philos Trans R Soc Lond A 231:289–337
Rice JA (2007) Mathematical statistics and data analysis, 3rd edn. Thomson/Brooks Cole, Belmont
Simon D, Senan C, Garnier P, Saint-Paul M, Papoz L (1989) Epidemiological features of glycated haemoglobin A1c-distribution in a healthy population: the Telecom study. Diabetologia 32:864–869
Washburn TC, Medearis DN, Childs B (1965) Sex differences in susceptibility to infections. Pediatrics 35:57–64
Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1:80–83


Author information

Authors and Affiliations

St Peter’s College, University of Oxford, New Inn Hall Street, Oxford, UK
John F. Bithell

Editor information

Editors and Affiliations

Department of Epidemiological Methods and Etiologic Research, Leibniz Institute for Prevention Research and Epidemiology – BIPS, Bremen, Germany
Wolfgang Ahrens
Department of Biometry and Data Management, Leibniz Institute for Prevention Research and Epidemiology – BIPS, Bremen, Germany
Iris Pigeot

Cite this entry

Bithell, J.F. (2014). Statistical Inference. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-09834-0_54

DOI: https://doi.org/10.1007/978-0-387-09834-0_54
Publisher Name: Springer, New York, NY
Print ISBN: 978-0-387-09833-3
Online ISBN: 978-0-387-09834-0
eBook Packages: Medicine, Reference Module Medicine
