Abstract
We discuss the Multi-hypergeometric and Multinomial distributions and their properties, with a focus on exact and large-sample inference for comparing two proportions or probabilities from the same or different populations. Relative risks and odds ratios are also considered. Maximum likelihood estimation, asymptotic normality theory, and simultaneous confidence intervals are given for the Multinomial distribution. The chapter closes with some applications to animal populations, including multiple-recapture methods, and the delta method.
References
Agresti, A. (1999). On logit confidence intervals for the odds ratio with small samples. Biometrics, 55(2), 597–602.
Agresti, A., & Caffo, B. (2000). Simple and effective confidence intervals for proportions and differences of proportions result from adding two successes and two failures. The American Statistician, 54(4), 280–288.
Agresti, A., & Min, Y. (2005). Frequentist performance of Bayesian confidence intervals for comparing proportions in \(2 \times 2\) contingency tables. Biometrics, 61(2), 515–523.
Andrés, A. M., & Tejedor, I. H. (2002). Comment on “equivalence testing for binomial random variables: Which test to use?” The American Statistician, 56(3), 253–254.
Brown, L., & Li, X. (2005). Confidence intervals for two sample binomial distribution. Journal of Statistical Planning and Inference, 130, 359–375.
Darroch, J. N. (1958). The multiple-recapture census. I. Estimation of a closed population. Biometrika, 45, 343–359.
Fagerland, M. W., Lydersen, S., & Laake, P. (2011). Recommended confidence intervals for two independent binomial proportions. Statistical Methods in Medical Research, to appear. doi:10.1177/0962280211415469.
Gart, J. J. (1966). Alternative analyses of contingency tables. Journal of the Royal Statistical Society, Series B, 28, 164–179.
Goodman, L. A. (1965). On simultaneous confidence intervals for multinomial proportions. Technometrics, 7, 247–254.
Hochberg, Y., & Tamhane, A. C. (1987). Multiple comparison procedures. New York: Wiley.
Johnson, N. L., Kotz, S., & Balakrishnan, N. (1997). Discrete multivariate distributions. New York: Wiley.
Katz, D., Baptista, J., Azen, S. P., & Pike, M. C. (1978). Obtaining confidence intervals for the risk ratio in cohort studies. Biometrics, 34, 469–474.
Krishnamoorthy, K., & Thomson, J. (2002). Hypothesis testing about proportions in two finite populations. The American Statistician, 56(3), 215–222.
Lahiri, S. N., Chatterjee, A., & Maiti, T. (2007). Normal approximation to the hypergeometric distribution in nonstandard cases and a sub-Gaussian Berry-Esseen theorem. Journal of Statistical Planning and Inference, 137, 3570–3590.
Mee, R. W. (1984). Confidence bounds for the difference between two probabilities. Biometrics, 40, 1175–1176.
Miettinen, O., & Nurminen, M. (1985). Comparative analysis of two rates. Statistics in Medicine, 4, 213–226.
Miller, R. G., Jr. (1981). Simultaneous statistical inference (2nd edn.). New York: Springer-Verlag.
Newcombe, R. G. (1998b). Interval estimation for the difference between two independent proportions: Comparison of eleven methods. Statistics in Medicine, 17(8), 873–890.
Scott, A. J., & Seber, G. A. F. (1983). Difference of proportions from the same survey. The American Statistician, 37(4), 319–320.
Seber, G. A. F. (1982). The estimation of animal abundance and related parameters (2nd edn.). London: Griffin. Also reprinted as a paperback in 2002 by Blackburn Press, Caldwell, NJ.
Seber, G. A. F. (2008). A matrix handbook for statisticians. New York: Wiley.
Seber, G. A. F., & Lee, A. J. (2003). Linear regression analysis (2nd edn.). New York: Wiley.
Wild, C. J., & Seber, G. A. F. (1993). Comparing two proportions from the same survey. The American Statistician, 47(3), 178–181. (Correction: 1994, 48(3):269).
Woolf, B. (1955). On estimating the relation between blood group and disease. Annals of Human Genetics, 19, 251–253.
Appendix: Delta Method
In this section we consider a well-known method for finding large sample variances. The theory is then applied to the Multinomial distribution. We also consider functions of Normal random variables.
3.1.1 General Theory
We consider general ideas only, without getting too involved with technical details about limits. Let \(X\) be a random variable with mean \(\mu \) and variance \(\sigma _X^2\), and let \(Y=g(X)\) be a “well-behaved” function of \(X\) that has a Taylor expansion
\[ g(X) = g(\mu ) + (X-\mu )g^{\prime }(\mu ) + \tfrac{1}{2}(X-\mu )^2 g^{\prime \prime }(X_0), \]
where \(X_0\) lies between \(X\) and \(\mu \) and \(g^{\prime }(\mu )\) is the derivative of \(g\) evaluated at \(X=\mu \). Assuming second order terms can be neglected, we have \(\mathrm{E}(Y)\approx g(\mu )\) and
\[ \mathrm{var}(Y) \approx \sigma _X^2\,[g^{\prime }(\mu )]^2. \]
For example, if \(g(X)=\log X\) then \(g^{\prime }(\mu )=1/\mu \) and, for large \(\mu \),
\[ \mathrm{var}(\log X) \approx \frac{\sigma _X^2}{\mu ^2}. \]
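The logarithm example can be checked by simulation. The sketch below is illustrative only: it assumes \(X\) is Normal with hypothetical values \(\mu = 100\) and \(\sigma _X = 5\) (chosen so that \(X\) is positive for all practical purposes), and the function names are not from the text. It compares the first-order value \(\sigma _X^2/\mu ^2\) with the sample variance of \(\log X\).

```python
import math
import random
import statistics


def delta_var_log(mu, sigma):
    """First-order delta-method variance of log X: sigma^2 * (1/mu)^2."""
    return (sigma / mu) ** 2


def simulated_var_log(mu, sigma, draws=50_000, seed=1):
    """Monte Carlo variance of log X for X ~ Normal(mu, sigma^2)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    logs = [math.log(rng.gauss(mu, sigma)) for _ in range(draws)]
    return statistics.variance(logs)


# Hypothetical values: mu is 20 standard deviations away from zero,
# so log X is defined for every draw in practice.
mu, sigma = 100.0, 5.0
approx = delta_var_log(mu, sigma)
simulated = simulated_var_log(mu, sigma)
print(f"delta method: {approx:.6f}  simulation: {simulated:.6f}")
```

The two values should agree closely when \(\sigma _X/\mu \) is small; the agreement degrades as that ratio grows, because the neglected second order terms then matter.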
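If \(\mathbf{X }=(X_1,X_2,\ldots ,X_k)^{\prime }\) is a vector with mean \({\varvec{\mu }}\), then for suitable \(g\),
\[ g(\mathbf{X }) = g({\varvec{\mu }}) + \sum _{i=1}^k (X_i-\mu _i)\,g_i^{\prime }({\varvec{\mu }}) + \text{second order terms}, \]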
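where \(g_i^{\prime }({\varvec{\mu }})\) is \(\partial g/\partial X_i\) evaluated at \(\mathbf{X }={\varvec{\mu }}\). If second order terms can be neglected, we have
\[ \mathrm{var}[g(\mathbf{X })] \approx \sum _{i=1}^k\sum _{j=1}^k g_i^{\prime }({\varvec{\mu }})\,g_j^{\prime }({\varvec{\mu }})\,\mathrm{cov}(X_i,X_j). \]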
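For a vector argument, the first-order variance is the quadratic form of the gradient of \(g\) in the covariance matrix of \(\mathbf{X }\). A minimal sketch follows, assuming the gradient is approximated by central differences; the helper names `numerical_gradient` and `delta_variance`, the step size, and the example mean and covariance are all hypothetical.

```python
def numerical_gradient(g, mu, rel_step=1e-6):
    """Central-difference approximation to the gradient of g at mu."""
    grad = []
    for i, m in enumerate(mu):
        h = rel_step * max(abs(m), 1.0)
        up = list(mu)
        down = list(mu)
        up[i] = m + h
        down[i] = m - h
        grad.append((g(up) - g(down)) / (2.0 * h))
    return grad


def delta_variance(g, mu, cov):
    """First-order variance: sum_i sum_j g_i'(mu) g_j'(mu) cov(X_i, X_j)."""
    grad = numerical_gradient(g, mu)
    k = len(mu)
    return sum(grad[i] * grad[j] * cov[i][j]
               for i in range(k) for j in range(k))


def ratio(x):
    """Example function g(X) = X_1 / X_2."""
    return x[0] / x[1]


# Hypothetical mean vector and covariance matrix (independent components).
mu = [10.0, 20.0]
cov = [[4.0, 0.0],
       [0.0, 9.0]]
var_ratio = delta_variance(ratio, mu, cov)
print(var_ratio)
```

When the partial derivatives are available in closed form, they can be supplied directly in place of the numerical gradient, which removes the dependence on the step size.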
3.1.2 Application to the Multinomial Distribution
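Suppose \(\mathbf{X }\) has the Multinomial distribution given by (3.10) and
\[ g(\mathbf{X }) = \frac{X_1X_2\cdots X_r}{X_{r+1}X_{r+2}\cdots X_s}. \]
Then, using the above approach with \(\mu _i=np_i\),
\[ \mathrm{E}[g(\mathbf{X })] \approx \frac{\mu _1\mu _2\cdots \mu _r}{\mu _{r+1}\mu _{r+2}\cdots \mu _s} = g({\varvec{\mu }}), \]
and it can be shown that (Seber 1982, pp. 8–9)
\[ \mathrm{var}[g(\mathbf{X })] \approx [g({\varvec{\mu }})]^2\left\{ \sum _{i=1}^s\frac{1}{\mu _i} - \frac{(s-2r)^2}{n}\right\} . \]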
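The Multinomial case can be checked numerically. The sketch below assumes \(g\) is a ratio of products of cell counts, \(g(\mathbf{X }) = X_1\cdots X_r/(X_{r+1}\cdots X_s)\), which is the form suggested by the cases \(s=2r=2\) and \(s=2r=4\), and uses the Multinomial moments \(\mathrm{var}(X_i)=np_i(1-p_i)\) and \(\mathrm{cov}(X_i,X_j)=-np_ip_j\). Under that assumption the double sum over the covariance matrix reduces to \([g({\varvec{\mu }})]^2\{\sum _i 1/\mu _i-(s-2r)^2/n\}\), and the code compares the two. Function names and the example probabilities are hypothetical.

```python
import math


def multinomial_cov(n, p):
    """Covariance matrix of Multinomial counts: n * p_i * (delta_ij - p_j)."""
    k = len(p)
    return [[n * p[i] * ((1.0 if i == j else 0.0) - p[j]) for j in range(k)]
            for i in range(k)]


def ratio_variance_quadratic(n, p, r, s):
    """Delta-method variance of g = X_1..X_r / (X_{r+1}..X_s) via the double sum."""
    mu = [n * pi for pi in p]
    g = math.prod(mu[:r]) / math.prod(mu[r:s])
    # dg/dX_i at mu is +g/mu_i for numerator cells and -g/mu_i for denominator cells.
    grad = [(1.0 if i < r else -1.0) * g / mu[i] for i in range(s)]
    cov = multinomial_cov(n, p)
    return sum(grad[i] * grad[j] * cov[i][j]
               for i in range(s) for j in range(s))


def ratio_variance_closed(n, p, r, s):
    """Closed form: g(mu)^2 * (sum_i 1/mu_i - (s - 2r)^2 / n)."""
    mu = [n * pi for pi in p]
    g = math.prod(mu[:r]) / math.prod(mu[r:s])
    return g ** 2 * (sum(1.0 / m for m in mu[:s]) - (s - 2 * r) ** 2 / n)


# Hypothetical four-cell Multinomial.
n, p = 200, [0.1, 0.2, 0.3, 0.4]
balanced = (ratio_variance_quadratic(n, p, 2, 4), ratio_variance_closed(n, p, 2, 4))
unbalanced = (ratio_variance_quadratic(n, p, 1, 3), ratio_variance_closed(n, p, 1, 3))
print(balanced, unbalanced)
```

The second call uses \(r=1\), \(s=3\) so that the correction term \((s-2r)^2/n\) is non-zero; in the balanced cases \(s=2r\) it vanishes.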
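Two cases of interest in this monograph are \(s=2r=2\) and \(s=2r=4\). In the first case \(g(\mathbf{X })=X_1/X_2\) and
\[ \mathrm{var}[g(\mathbf{X })] \approx \left( \frac{\mu _1}{\mu _2}\right) ^2\left( \frac{1}{\mu _1}+\frac{1}{\mu _2}\right) . \]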
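We are particularly interested in \(Y=\log g(\mathbf{X })\), so that from (3.31),
\[ \mathrm{var}(Y) \approx \frac{\mathrm{var}[g(\mathbf{X })]}{[g({\varvec{\mu }})]^2} = \frac{1}{\mu _1}+\frac{1}{\mu _2}. \]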
If \(g(\mathbf{X })\) is a product of two such independent ratios from independent Binomial distributions, then we just add two more terms to \(\mathrm{{var}}(Y)\). We can estimate \(\mathrm{{var}} (Y)\) by replacing each \(\mu _i\) by \(X_i\) in (3.35).
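In practice this gives a simple large-sample interval on the log scale: the estimated variance of \(Y\) is the sum of the reciprocals of the observed counts entering the ratio. The following sketch assumes a Normal critical value of 1.96 for an approximate 95% interval and uses hypothetical counts from two independent Binomial samples; the function name `log_ratio_interval` is illustrative and not from the text.

```python
import math


def log_ratio_interval(numerator_counts, denominator_counts, z=1.96):
    """Estimate a ratio of products of counts with a log-scale interval.

    The estimated variance of the log ratio is the sum of 1/X_i over all
    counts, i.e. each mu_i replaced by the observed X_i.
    """
    counts = list(numerator_counts) + list(denominator_counts)
    if any(c <= 0 for c in counts):
        raise ValueError("all counts must be positive on the log scale")
    log_g = (sum(math.log(c) for c in numerator_counts)
             - sum(math.log(c) for c in denominator_counts))
    se = math.sqrt(sum(1.0 / c for c in counts))
    estimate = math.exp(log_g)
    lower = math.exp(log_g - z * se)
    upper = math.exp(log_g + z * se)
    return estimate, lower, upper, se


# Hypothetical data: 15 successes and 85 failures in sample 1,
# 30 successes and 70 failures in sample 2.
# Odds ratio = (15 / 85) * (70 / 30), a product of two independent ratios.
estimate, lower, upper, se = log_ratio_interval([15, 70], [85, 30])
print(f"odds ratio {estimate:.3f}, interval ({lower:.3f}, {upper:.3f})")
```

Zero counts are rejected rather than adjusted here; small-sample corrections, such as those compared in the references on intervals for odds ratios, are outside the scope of this sketch.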
Using similar algebra, we find that
3.1.3 Asymptotic Normality
In later chapters we are interested in functions of a maximum likelihood estimator, which we know is asymptotically Normally distributed under fairly general conditions. For example, suppose \(\sqrt{n}(\widehat{{\varvec{\mu }}}_n - {\varvec{\mu }})\) is asymptotically \(N(\mathbf{{0}}, {\varvec{\Sigma }}({\varvec{\mu }}))\). Then using the delta method above, \(\sqrt{n}(g(\widehat{{\varvec{\mu }}}_n)-g({\varvec{\mu }}))\) is asymptotically distributed as \(N(0,\sigma _g^2)\) as \(n\rightarrow \infty \), where
\[ \sigma _g^2 = \sum _{i}\sum _{j} g_i^{\prime }({\varvec{\mu }})\,g_j^{\prime }({\varvec{\mu }})\,\{ {\varvec{\Sigma }}({\varvec{\mu }})\} _{ij}. \]
This result also holds if we replace \(g\) by a vector function \(\mathbf{g }\) giving us \(N(\mathbf{{0}}, {{\varvec{\Sigma }}}_{\mathbf{g }})\).
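For a vector function, the standard form of the limiting covariance is \(\mathbf{J}{\varvec{\Sigma }}\mathbf{J}^{\prime }\), where \(\mathbf{J}\) is the matrix of partial derivatives of \(\mathbf{g }\) at \({\varvec{\mu }}\). The sketch below is an illustration under assumed values: it takes two cell proportions \((p_1,p_2)=(0.2,0.3)\) from a Multinomial, for which \({\varvec{\Sigma }}\) has diagonal entries \(p_i(1-p_i)\) and off-diagonal entry \(-p_1p_2\), and the hypothetical function \(\mathbf{g }=(\log (p_1/p_2),\,p_1+p_2)^{\prime }\).

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def delta_covariance(jacobian, sigma):
    """Limiting covariance of g(mu_hat): J * Sigma * J'."""
    return matmul(matmul(jacobian, sigma), transpose(jacobian))


# Assumed example values for two Multinomial cell proportions.
p1, p2 = 0.2, 0.3
sigma = [[p1 * (1 - p1), -p1 * p2],
         [-p1 * p2, p2 * (1 - p2)]]
# Rows of J are the gradients of log(p1/p2) and of p1 + p2.
jacobian = [[1 / p1, -1 / p2],
            [1.0, 1.0]]
sigma_g = delta_covariance(jacobian, sigma)
print(sigma_g)
```

The first diagonal entry equals \(1/p_1+1/p_2\), which matches the sum-of-reciprocals variance for a log ratio after scaling by \(n\); the second equals \((p_1+p_2)(1-p_1-p_2)\), the usual Binomial form for a pooled cell.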
Copyright information
© 2013 The Author(s)
About this chapter
Cite this chapter
Seber, G.A.F. (2013). Several Proportions or Probabilities. In: Statistical Models for Proportions and Probabilities. SpringerBriefs in Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-39041-8_3
DOI: https://doi.org/10.1007/978-3-642-39041-8_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-39040-1
Online ISBN: 978-3-642-39041-8
eBook Packages: Mathematics and Statistics (R0)