## Abstract

This chapter provides an overview of the principal aspects of statistical inference as it is encountered in epidemiology. It makes minimal assumptions about the theoretical background and attempts to provide a simple introduction to both the frequentist and Bayesian approaches. The former still predominates in the applied literature, while theoreticians increasingly argue that it should be replaced by the latter, Bayesian approach. In this chapter, we take the view that these two approaches to inference should not be regarded as antagonistic, but rather as aiming at fundamentally different objectives, which may be regarded respectively as the analysis of data and the weighing of evidence – the latter typically derived from essentially different sources.

## Notes

- 1.
This most useful word means a “quantity constant in (the) case considered, but varying in different cases” (Concise Oxford Dictionary). Its use to mean “boundary” (perhaps by confusion with “perimeter”), “limits” or “constraints” is to be deprecated.

- 2.
Mathematically speaking, a random variable is a function associating each possible outcome of an experiment with a particular numerical value, but at an elementary level, we can take it to be what it sounds like – a quantity that is variable from case to case in a manner that is intrinsically random.

- 3.
We follow the usual notational convention that a random variable is denoted by an uppercase letter and a specific value by a lowercase letter. \(\mathrm{P}[\cdot ]\) denotes the probability of an event.

- 4.
The variance is the square of the SD and is a measure of spread, or dispersion, that is more convenient than the SD for some purposes.

- 5.
This key result in statistical inference was first expounded by the Rev. Thomas Bayes (1701–1761), an English Presbyterian minister.

- 6.
An increasing transformation of \(x\) returns a value \(f(x)\) with the property that \(f(x_{2}) > f(x_{1})\) whenever \(x_{2} > x_{1}\).

- 7.
In other cases, it may be difficult or impossible to determine exactly, but the important Central Limit Theorem ensures that it will be very nearly normal in most circumstances.

- 8.
Technically, the precision is the reciprocal of the square of the s.e., that is, of the variance of the sampling distribution.

- 9.
Interestingly, because the unbiased estimator has slightly greater variability, neither this nor the MLE has the smallest mean squared error – the estimator with this property is \(Q^{2}/(n+1)\).

- 10.
The chi-squared distribution with \(k\) d.f., denoted by \(\chi _{k}^{2}\), may be defined as the distribution of the sum of squares of \(k\) independent variates having the standard normal distribution.

- 11.
Karl Pearson (1857–1936) effectively founded the discipline of scientific statistics at University College, London, in the first quarter of the twentieth century; his son E.S. Pearson was the first Professor of Statistics in the same college.

- 12.
Student was the pen name of W.S. Gosset (1876–1937), who worked for Guinness, which did not permit him to publish the results of his work under his own name.

- 13.
In an epidemiological context, see, for example, the book by Clayton and Hills (1993).

- 14.
It is an interesting mathematical property of the \(\chi _{2}^{2}\)-distribution that this value of \(\rho\) is exact, and indeed, the plausibility fraction for a two-dimensional \(1-\alpha\) confidence region is \(\alpha\) for *any* \(\alpha\), at least to the extent that \(2\Lambda\) is approximately distributed as \(\chi _{2}^{2}\).

- 15.
\(\mathrm{logit}(p)\) is defined as \(\log (p/(1-p))\).

- 16.
In spite of the usefulness of this approach, this expression does not appear regularly in the literature as yet.

- 17.
It is relatively easy to write a function performing these calculations in R, for example.
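
The numerical claims in Notes 9, 10 and 14 can be checked with a short sketch. It is written in Python rather than R and is not the function that Note 17 refers to. Several things are assumptions made for illustration: that \(Q^{2}\) is the sum of squared deviations from the sample mean of \(n\) independent normal observations, that the plausibility fraction is the relative likelihood \(\exp (-\Lambda )\) on the boundary of the region \(2\Lambda \le c\), and all function names.

```python
import math
import random


def mse_of_scaled_sum_of_squares(n: int, divisor: float, sigma2: float = 1.0) -> float:
    """Exact mean squared error of Q^2 / divisor as an estimator of sigma^2.

    Assumption: Q^2 / sigma^2 is chi-squared with n - 1 d.f., which has
    mean n - 1 and variance 2(n - 1).
    """
    df = n - 1
    variance = 2.0 * df * sigma2**2 / divisor**2
    bias = df * sigma2 / divisor - sigma2
    return variance + bias**2


def critical_value_2df(alpha: float) -> float:
    """Value c with P[X > c] = alpha for X ~ chi-squared with 2 d.f.

    The upper tail of this distribution is exp(-c / 2), so c = -2 log(alpha).
    """
    return -2.0 * math.log(alpha)


def simulated_tail_2df(c: float, draws: int = 100_000, seed: int = 1) -> float:
    """Monte Carlo estimate of P[Z1^2 + Z2^2 > c] for independent standard
    normal Z1, Z2, i.e. the definition of chi-squared with 2 d.f. in Note 10."""
    rng = random.Random(seed)
    hits = sum(rng.gauss(0.0, 1.0) ** 2 + rng.gauss(0.0, 1.0) ** 2 > c
               for _ in range(draws))
    return hits / draws


if __name__ == "__main__":
    n = 10  # hypothetical sample size
    for label, divisor in [("n - 1 (unbiased)", n - 1), ("n (MLE)", n), ("n + 1", n + 1)]:
        print(f"divisor {label:>16}: MSE = {mse_of_scaled_sum_of_squares(n, divisor):.4f}")

    for alpha in (0.10, 0.05, 0.01):
        c = critical_value_2df(alpha)
        boundary_ratio = math.exp(-c / 2.0)  # relative likelihood where 2*Lambda = c
        print(f"alpha = {alpha}: c = {c:.3f}, boundary ratio = {boundary_ratio:.3f}, "
              f"simulated tail = {simulated_tail_2df(c):.3f}")
```

Under these assumptions the divisor \(n+1\) gives the smallest of the three mean squared errors, and the relative likelihood on the boundary equals \(\alpha\) because the upper tail of \(\chi _{2}^{2}\) at \(c\) is exactly \(\exp (-c/2)\).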

## References

Agresti A, Gottard A (2007) Nonconservative exact small-sample inference for discrete data. Comput Stat Data Anal 51:6447–6458

Armitage P, Berry G, Matthews JNS (2002) Statistical methods in medical research, 4th edn. Blackwell, Oxford

Bortolussi R, Krishnan C, Armstrong D, Tovichayathamrong P (1978) Prognosis for survival in neonatal meningitis: clinical and pathologic review of 52 cases. Can Med Assoc J 118:165–168

Clayton D, Hills M (1993) Statistical models in epidemiology. Oxford University Press, Oxford

Cox DR (2006) Principles of statistical inference. Cambridge University Press, Cambridge

Czerniawska J, Bielen P, Plywaczewski R, Czystowska M, Korzybski D, Sliwinski P, Gorecka D (2008) Metabolic abnormalities in obstructive sleep apnea patients. Pneumonol Alergol Pol 76(5):340–347

Draper GJ, Stiller CA, Cartwright RA, Craft AW, Vincent TJ (1993) Cancer in Cumbria and in the vicinity of the Sellafield nuclear installation, 1963–90. Br Med J 306:89–94

Everitt BS (1994) A handbook of statistical analyses using S-PLUS. Chapman and Hall, London

Gamerman D, Lopes HF (2006) Markov chain Monte Carlo: stochastic simulation for Bayesian inference, 2nd edn. Chapman and Hall/CRC, London

Greenland S (2006) Bayesian perspectives for epidemiological research: I. Foundations and basic methods. Int J Epidemiol 35:765–775

Hoenig JM, Heisey DM (2001) The abuse of power: the pervasive fallacy of power calculations for data analysis. Am Stat 55(1):19–24

Lee PM (2004) Bayesian statistics: an introduction, 2nd edn. Arnold and Oxford University Press, New York

Marriott FHC (1979) Barnard’s Monte Carlo tests: how many simulations? Appl Stat 28(1):75–77

Myint PK, Sinha S, Wareham NJ, Bingham SA, Luben RN, Welch AA, Khaw KT (2007) Glycated hemoglobin and risk of stroke in people without known diabetes in the European Prospective Investigation into Cancer (EPIC)-Norfolk prospective population study – A threshold relationship? Stroke 38(2):271–275

Neyman J, Pearson ES (1933) On the problem of the most efficient tests of statistical hypotheses. Philos Trans R Soc Lond A 231:289–337

Rice JA (2007) Mathematical statistics and data analysis, 3rd edn. Thomson/Brooks Cole, Belmont

Simon D, Senan C, Garnier P, Saint-Paul M, Papoz L (1989) Epidemiological features of glycated haemoglobin A1c-distribution in a healthy population: the Telecom study. Diabetologia 32:864–869

Washburn TC, Medearis DN, Childs B (1965) Sex differences in susceptibility to infections. Pediatrics 35:57–64

Wilcoxon F (1945) Individual comparisons by ranking methods. Biom Bull 1:80–83

## Copyright information

© 2014 Springer Science+Business Media New York

## About this entry

### Cite this entry

Bithell, J.F. (2014). Statistical Inference. In: Ahrens, W., Pigeot, I. (eds) Handbook of Epidemiology. Springer, New York, NY. https://doi.org/10.1007/978-0-387-09834-0_54

DOI: https://doi.org/10.1007/978-0-387-09834-0_54

Publisher Name: Springer, New York, NY

Print ISBN: 978-0-387-09833-3

Online ISBN: 978-0-387-09834-0

eBook Packages: Medicine, Reference Module Medicine