Lectures for chemists on statistics II. The normal distribution: a briefer on the univariate case

Review article, published in Accreditation and Quality Assurance

Abstract

Motivated by the introduction of a supplement to the GUM suggesting Monte Carlo simulation as a method for establishing a complete measurement uncertainty budget, some properties of the normal distribution are reviewed. The normal distribution is the central distribution of parametric statistics, and sampling from normal distributions is a regularly occurring activity in statistical simulation by Monte Carlo methods. Algorithms for computer generation of normal deviates, areas under the normal curve, and the error function are given to encourage practical simulation. Some critical issues, e.g. coverage and influential observations, are presented. The discussion is restricted to the univariate situation and makes no claim to exhaustiveness: statistics is a specialised field requiring sophisticated competences. The text may serve as a guiding thread for gaining a quick survey of a subject of general interest.


[Fig. 1, Fig. 2, Fig. 3: thumbnails only; captions not preserved]

Notes

  1. The term “observation” is used here, and at other locations in this manuscript, in its statistical sense: the result of an experiment or trial, where either numerical or categorical outcomes (e.g. head or tail of a tossed coin) may be obtained.

  2. The term “statistic” refers to a quantity that characterises a set of observations and does not depend on unknown parameters.

  3. Those familiar with these definitions will notice that the distinction between weakly and strongly consistent estimators is not elaborated, because this text is of an introductory nature.

Abbreviations

N: Normal distribution
μ: Location parameter of the normal distribution
σ: Dispersion parameter of the normal distribution
x: Arbitrary value, observation
π: Pi (3.1415...)
v: Variance (v = s²)
n: Sample size
x_i: ith sample out of n samples
x̄: Mean value
P: Error function (cumulative normal distribution)
e: Euler's number
ɛ_i: Disturbance
e_i: Residual
dz: d operator; z variable
z: Variable
Φ: Cumulative normal distribution
χ²: Chi-square distribution
t: Student t distribution
m′: Mid-range
m: Integer number(s)
E: Expectation operator
θ, θ′: Statistic and its estimator
Pr: Probability operator
ε: Lower bound
M: Median
g: A function
F: Empirical distribution function
y: Threshold
p: Probability
erf: Error function
d_max: Kolmogorov parameter
CI: Confidence interval
α: Confidence level
df: Degrees of freedom
k: Coverage factor

References

  1. de Moivre A (1733) Approximatio ad summam terminorum binomii (a + b)^n in seriem expansi. Reproduced in Archibald RC (1926) Isis 8:671–683

  2. Galton F (1889) Natural inheritance. MacMillan, London

  3. Pearson K (1905) The problem of the random walk. Nature 72:294, 342

  4. Gauß CF (1809) Theoria motus corporum coelestium. Perthes, Hamburg

  5. Euler L (1749) Recherches sur la question des inégalités du mouvement de Saturne et de Jupiter, sujet proposé pour le prix de l’année 1748 par l’académie royale des sciences de Paris. Reprinted in: Leonhardi Euleri Opera Omnia, ser. 2, vol 25. Turici, Basel (1960)

  6. Fisher RA (1922) On the mathematical foundation of theoretical statistics. Phil Trans R Soc A 222:309–368

  7. Thompson M (1994) Statistics—the curse of the analytical classes. Analyst 119:127N

  8. Salsburg DS (1985) The religion of statistics as practised in medical journals. Am Stat 39:220–223

  9. Huber W (2004) On the use of the correlation coefficient r for testing the linearity of calibration functions. Accred Qual Assur 9:726

  10. Hibbert DB (2005) Further comments on the (miss-)use of r for testing the linearity of calibration functions. Accred Qual Assur 10:300–301

  11. Maxwell JC (1860) Illustrations of the dynamical theory of gases. Phil Mag 19:19–32

  12. Meinrath G (2008) Lectures for chemists on statistics. I. Belief, probability, frequency, and statistics: decision making in a floating world. Accred Qual Assur 13:3–9

  13. Cofino WP, van Stokkum IHM, Wells DE, Ariese F, Wegener JWM, Peerboom RAL (2000) Chemometr Intell Lab Syst 51:37–55

  14. Fearn T (2004) Comments on “Cofino statistics”. Accred Qual Assur 9:441–444

  15. Meinrath G, Kalin M (2005) The role of metrology in making chemistry sustainable. Accred Qual Assur 10:327–337

  16. Meinrath G, Ekberg C, Landgren A, Liljenzin JO (2000) Assessment of uncertainty in parameter evaluation and prediction. Talanta 51:231–246

  17. Adams AG (1969) Algorithm 39. Areas under the normal curve. Comput J 12:197–198

  18. Cody WJ (1969) Rational Chebyshev approximation for the error function. Math Comp 22:631–637

  19. Beasley JD, Springer SG (1981) Algorithm AS 111: the percentage points of the normal distribution. Appl Stat 30:118–121

  20. Cody WJ (1990) Performance of programs for the error and complementary error functions. ACM Trans Math Softw 16:29–37

  21. Knuth DE (1972) The art of computer programming, vol 2: Seminumerical algorithms. Addison-Wesley, Reading

  22. Bays C, Durham SD (1976) Improving a poor random number generator. ACM Trans Math Softw 2:59–64

  23. Chay SC, Fardo RD, Mazumdar M (1975) On using the Box–Muller transformation with multiplicative congruential pseudo-random number generators. Appl Stat 24:132–135

  24. Kankaala K, Ala-Nissila T, Vattulainen I (1993) Bit-level correlations in some pseudorandom number generators. Phys Rev E 48:4211–4214

  25. Hellekalek P (1998) Good random number generators are (not so) easy to find. Math Comp Simul 46:485–505

  26. http://random.mat.sbg.ac.at/links/rando.html (last accessed May 5 2006)

  27. Box GEP, Muller ME (1958) A note on the generation of random normal deviates. Ann Math Stat 29:610–611

  28. Dean RB, Dixon WJ (1951) Simplified statistics for small numbers of observations. Anal Chem 23:636–638

  29. Rorabacher DB (1991) Statistical treatment of rejection of deviant values: critical values of Dixon’s ‘Q’ parameter and related subrange ratios at the 95% confidence level. Anal Chem 63:139–146

  30. Beckman RJ, Cook RD (1983) Outlier...s. Technometrics 25:119

  31. This quote is attributed to different statisticians in various versions

  32. Student (Gosset W) (1908) The probable error of a mean. Biometrika 6:1–25

  33. Rubinstein RY (1981) Simulation and the Monte Carlo method. Wiley, Chichester, 280 p

  34. Fisher RA (1939) “Student”. Ann Eugenics 9:1–9

  35. Meinrath G (2008) Response to a Letter to the Editor. Accred Qual Assur (submitted)

  36. Fisher RA (1951) Statistics. In: Heath EA (ed) Scientific thought in the twentieth century. Watts, London, pp 31–55

  37. Kendall MG (1942) On the future of statistics. J R Stat Soc 105:69–80

  38. Rousseeuw PJ, Leroy AM (1987) Robust regression and outlier detection. Wiley, New York

  39. Meinrath G, Schneider P (2007) Quality assurance in chemistry and environmental studies. From pH measurement to nuclear waste disposal. Springer, Berlin Heidelberg New York, 326 p

Author information

Correspondence to Günther Meinrath.

Appendix

The programs for calculating a standardized cumulative normal distribution and the error function are written in QBasic and run in a DOS box. QBasic is part of the Windows 95 operating system and is included on most Windows 98 installation disks; it is also available from many Internet sites, and compilers for QBasic exist. The code given here is intended for demonstration purposes only. All arithmetic operations are performed in single precision, although the introduction of appropriate statements would allow them to run in double precision without complicated modifications. The algorithms are useful for those interested in implementing their own stochastic simulations.
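Before turning to the two listings, the generation of normal deviates mentioned in the abstract can be sketched as well. The classical route is the Box–Muller transform [27]; the following minimal Python version is a modern stand-in, not one of the original QBasic programs, and all names are chosen here for illustration. It generates pairs of standard normal deviates from pairs of uniform deviates and accumulates their sample moments:

```python
import math
import random

# Box-Muller transform [27]: maps two independent uniform deviates
# onto two independent standard normal deviates.
def box_muller(u1: float, u2: float) -> tuple[float, float]:
    r = math.sqrt(-2.0 * math.log(u1))
    theta = 2.0 * math.pi * u2
    return r * math.cos(theta), r * math.sin(theta)

random.seed(1)                      # reproducible demonstration run
sample = []
for _ in range(50_000):
    u1 = 1.0 - random.random()      # shift to (0, 1] so log(u1) is defined
    z1, z2 = box_muller(u1, random.random())
    sample.extend((z1, z2))

mean = sum(sample) / len(sample)
var = sum((z - mean) ** 2 for z in sample) / len(sample)
# for 100 000 deviates, mean should be close to 0 and variance close to 1
```

The quality of the underlying uniform generator is essential here; refs [22–25] discuss pitfalls of poor generators in exactly this setting.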

The algorithm is an adaptation of Adams’ Algorithm 39 [17]. The code has been tested against other implementations and against tabulated values of the error function.

REM ****** Algorithm 39: Areas under the normal curve *******************

[Figure a: QBasic listing of Algorithm 39 (areas under the normal curve); not reproduced here]
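Where the listing itself is not available, the quantity Algorithm 39 computes, the area under the standardized normal curve up to x, can be obtained in a few lines from a library error function. The following Python sketch is a stand-in using that relationship, not Adams’ rational approximation itself:

```python
import math

# Standardized cumulative normal distribution P(x), expressed through the
# library error function: P(x) = (1 + erf(x / sqrt(2))) / 2.
def cumulative_normal(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# cumulative_normal(1.96) is close to 0.975, the familiar two-sided 95% point
```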

The second piece of code returns the error function erf(x). It uses the relationship between the cumulative normal distribution and the error function: an alternative Chebyshev-type polynomial approximation to the area under the normal curve is evaluated, and from this value erf(x) is obtained via Eq. 10.

REM ****** evaluates the error function erf(x) **************

CONST a1 = 1.26551223#
CONST a2 = 1.00002368#
CONST a3 = .37409196#
CONST a4 = .09678418#
CONST a5 = -.18628806#
CONST a6 = .27886807#
CONST a7 = -1.13520398#
CONST a8 = 1.48851587#
CONST a9 = -.82215223#
CONST a10 = .17087277#
CONST Pi2 = 6.283
sqr2 = SQR(2)
LOCATE 12, 12
INPUT "value, for which erf(x) should be evaluated"; x
erfin = x
GOSUB 100
CLS
LOCATE 12, 12
PRINT "erf:"; erf
END

REM ******* evaluates erf(x) ***********************

100 z = ABS(x)
t = 1! / (1! + .5 * z)
t1 = a5 + t * (a6 + t * (a7 + t * (a8 + t * (a9 + t * a10))))
ans = t * EXP(-z * z - a1 + t * (a2 + t * (a3 + t * (a4 + t * t1))))
IF erfin >= 0 THEN erf = 1! - .5 * ans ELSE erf = .5 * ans
erf = 2 * (erf - .5)
RETURN
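For readers without a QBasic environment, the listing above ports directly to other languages. The following Python sketch reuses the same coefficients a1 to a10 and the same final transformation via Eq. 10, dropping only the interactive INPUT/PRINT scaffolding; it can be checked against the standard library’s erf:

```python
import math

# Port of the QBasic routine above: rational (Chebyshev-type) approximation
# of the area under the normal curve, from which erf(x) is recovered.
def erf_approx(x: float) -> float:
    z = abs(x)
    t = 1.0 / (1.0 + 0.5 * z)
    # nested polynomial in t with the coefficients a1..a10 of the listing
    t1 = -0.18628806 + t * (0.27886807 + t * (-1.13520398
         + t * (1.48851587 + t * (-0.82215223 + t * 0.17087277))))
    ans = t * math.exp(-z * z - 1.26551223 + t * (1.00002368
          + t * (0.37409196 + t * (0.09678418 + t * t1))))
    p = 1.0 - 0.5 * ans if x >= 0 else 0.5 * ans   # cumulative-style value
    return 2.0 * (p - 0.5)                         # Eq. 10: erf from P

# agreement with the library error function to within ~2e-7
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert abs(erf_approx(x) - math.erf(x)) < 2e-7
```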

Cite this article

Meinrath, G. Lectures for chemists on statistics II. The normal distribution: a briefer on the univariate case. Accred Qual Assur 13, 179–192 (2008). https://doi.org/10.1007/s00769-008-0359-9
