The choice of sample size in estimating entropy according to a stratified sampling

  • María Angeles Gil
  • Covadonga Caso
Information Theory
Part of the Lecture Notes in Computer Science book series (LNCS, volume 313)

Abstract

Havrda and Charvát (1967) and Daróczy (1970) introduced the nonadditive entropies of order α. Each of these can be regarded as a function defined on the class of distributions associated with a given population, quantifying, in a certain sense, the similarity between the considered distribution and the uniform one.
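To make the definition concrete, the following is a minimal sketch of the entropy of order α in one commonly quoted normalization, H_α(p) = (Σ p_i^α − 1) / (2^(1−α) − 1) for α > 0, α ≠ 1. The function name and the choice of normalizing constant are assumptions made for illustration and are not taken from the paper itself.

```python
def entropy_of_order(p, alpha):
    """Nonadditive entropy of order alpha for a probability vector p.

    Assumed normalization: (sum p_i^alpha - 1) / (2^(1 - alpha) - 1),
    which gives the value 1 for the uniform distribution on two classes.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    if abs(sum(p) - 1.0) > 1e-9:
        raise ValueError("p must sum to 1")
    power_sum = sum(x ** alpha for x in p if x > 0)
    return (power_sum - 1.0) / (2.0 ** (1.0 - alpha) - 1.0)


# For alpha = 2 this reduces to 2 * (1 - sum p_i^2).
uniform = [0.25, 0.25, 0.25, 0.25]
skewed = [0.7, 0.1, 0.1, 0.1]
degenerate = [1.0, 0.0, 0.0, 0.0]
print(entropy_of_order(uniform, 2))     # largest of the three
print(entropy_of_order(skewed, 2))
print(entropy_of_order(degenerate, 2))  # 0.0
```

Under this normalization the value is largest for the uniform distribution and zero for a distribution concentrated on a single class, which is the sense in which the entropy measures closeness to uniformity.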

When the population is finite but too large for a census, the entropy of order α = 2 can be estimated without bias from a sample drawn at random from it. This estimation is especially useful when entropy is conceived as a measure of diversity within the population. In such cases, the populations to which the estimation is usually applied (e.g., anthropological, ecological, industrial, linguistic and sociological populations) are often naturally stratified.
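As an illustration of unbiased estimation for α = 2, the sketch below applies the standard bias corrections for simple random sampling (not the stratified estimator studied in the paper). It assumes the entropy of order 2 is written as 2(1 − Σ p_i²); the function name, the `population_size` parameter and that normalization are assumptions introduced here.

```python
def quadratic_entropy_unbiased(counts, population_size=None):
    """Unbiased estimate of 2 * (1 - sum p_i^2) from class counts.

    counts: number of sampled units observed in each class.
    population_size: None for sampling with replacement; the finite
    population size N for simple random sampling without replacement.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two sampled units")
    # Plug-in value 1 - sum f_i^2, which underestimates on average.
    plug_in = 1.0 - sum((c / n) ** 2 for c in counts)
    if population_size is None:
        correction = n / (n - 1)  # with replacement
    else:
        big_n = population_size
        if n > big_n:
            raise ValueError("sample larger than population")
        correction = n * (big_n - 1) / (big_n * (n - 1))  # without replacement
    return 2.0 * correction * plug_in


sample_counts = [3, 3, 4]
print(quadratic_entropy_unbiased(sample_counts))
print(quadratic_entropy_unbiased(sample_counts, population_size=1000))
```

When the sample is the whole population (n = N), the without-replacement correction equals 1 and the estimate coincides with the population value, which is a quick sanity check on the formula.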

In the present paper, we first discuss the problem of choosing a suitable sample size for estimating entropy, on the basis of the information supplied by a pilot survey or a previous sample drawn at random by stratified sampling with proportional allocation from the same or a similar population. We then establish an alternative, conservative criterion for choosing the sample size, based on the asymptotic distribution of the sample entropy.
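The general idea of a pilot-based sample size can be sketched as follows. This is not the criterion derived in the paper: it is a generic delta-method approximation, assuming sampling with replacement within strata, proportional allocation, and a normal approximation for the estimate of 1 − Σ p_i² (half the entropy of order 2 in the 2(1 − Σ p_i²) normalization). All names and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_from_pilot(pilot_counts, weights, half_width, confidence=0.95):
    """Approximate total sample size for estimating 1 - sum p_i^2.

    pilot_counts[k][i]: pilot count of class i in stratum k.
    weights[k]: population share W_k of stratum k (shares sum to 1).
    half_width: desired half-width of the confidence interval.
    """
    # Within-stratum class proportions p_ik estimated from the pilot.
    props = [[c / sum(row) for c in row] for row in pilot_counts]
    n_classes = len(props[0])
    # Overall class proportions p_i = sum_k W_k * p_ik.
    p = [sum(w * pk[i] for w, pk in zip(weights, props)) for i in range(n_classes)]
    # Delta method: n * Var is approximately
    # 4 * sum_k W_k * [sum_i p_i^2 p_ik - (sum_i p_i p_ik)^2].
    unit_var = 0.0
    for w, pk in zip(weights, props):
        second = sum(pi * pi * pki for pi, pki in zip(p, pk))
        first = sum(pi * pki for pi, pki in zip(p, pk))
        unit_var += w * (second - first ** 2)
    unit_var *= 4.0
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil(unit_var * (z / half_width) ** 2)


# Hypothetical pilot survey: two strata, three classes.
pilot = [[30, 15, 5], [10, 20, 20]]
print(sample_size_from_pilot(pilot, weights=[0.6, 0.4], half_width=0.05))
```

Because the first-order variance vanishes when the overall distribution is uniform, a purely pilot-based size can be misleadingly small near uniformity, which is one reason a conservative alternative criterion is of interest.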

Keywords

Mutual Information; Asymptotic Distribution; Stratify Sampling; Unbiased Estimator; Stratify Random Sampling
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

References

  1. BHARGAVA, T.N. and UPPULURI, V.R.R. (1977). "Sampling Distribution of Gini's Index of Diversity," Appl. Math. Comput., 3, 1–24.
  2. CASO, C. and GIL, M.A. (1988a). "The Gini-Simpson index of Diversity: Estimation in the stratified sampling," Commun. Statist.-Theor. Meth., (accepted for publication).
  3. DAGET, Ph. et GODRON, M. (1982). Analyse Fréquentielle de L'Écologie. Paris: Masson.
  4. DAROCZY, Z. (1970). "Generalized Information Functions," Inf. and Contr., 16, 36–51.
  5. EMPTOZ, H. (1976). Informations Utiles et Pseudoquestionnaires, PhD Thesis, Lyon: Université Claude-Bernard.
  6. GIL, M.A. and CASO, C. (1987a). "A note on the Estimation of the Mutual Information in the Stratified Sampling," Metron, 45, 295–301.
  7. GIL, M.A., FERNANDEZ, M.J. and MARTINEZ, I. (1988b). "The choice of the sample size in estimating the mutual information," Appl. Math. Comput., (accepted for publication).
  8. GIL, M.A., PEREZ, R. and GIL, P. (1987b). "The Mutual Information. Estimation in the Sampling Without Replacement," Kybernetika, 23, 5, 406–419.
  9. GIL, M.A., PEREZ, R. and MARTINEZ, I. (1986). "The Mutual Information. Estimation in the Sampling With Replacement," R.A.I.R.O.-Rech. Opér., 20, 3, 257–268.
  10. GINI, C. (1912). Variabilità e mutabilità. Studi Economico-Giuridici della Facoltà di Giurisprudenza dell'Università di Cagliari, a III, Parte II.
  11. HAVRDA, J. and CHARVAT, F. (1967). "Quantification method of classification processes," Kybernetika, 3, 30–35.
  12. LOMNICKI, Z. and ZAREMBA, S. (1959). "The Asymptotic Distributions of Estimators of the Amount of Transmitted Information," Inf. and Contr., 2, 260–284.
  13. MARGALEF, D.R. (1958). "Information Theory in Ecology," General Systems, 3, 36–71.
  14. MARTINEZ, I., PEREZ, R. y GIL, M.A. (1985). "Simulación de Montecarlo para la comparación de las estimaciones de las entropías cuadrática y de Shannon en el muestreo con reemplazamiento," Actas XV Reunión Nacional de Estadística e Investigación Operativa, pp. 436–444.
  15. MATHAI, A.M. and RATHIE, P.N. (1975). Basic Concepts in Information Theory and Statistics. New Delhi: Wiley Eastern Limited.
  16. NAYAK, T.K. (1985). "On Diversity Measures based on Entropy Functions," Commun. Statist.-Theor. Meth., 14, 1, 203–215.
  17. PEREZ, R., CASO, C. and GIL, M.A. (1986a). "Unbiased Estimation of Income Inequality," Statistische Hefte, 27, 227–237.
  18. PEREZ, R., GIL, M.A. and GIL, P. (1986b). "Estimating the Uncertainty associated with a Variable in a Finite Population," Kybernetes, 15, 251–256.
  19. PIELOU, E.C. (1975). Ecological Diversity. New York: Wiley Interscience.
  20. RAO, C.R. (1982a). "Gini-Simpson Index of Diversity; a characterization, generalization and applications," Utilitas Mathematica, 21B, 273–282.
  21. RAO, C.R. (1982b). "Diversity: its measurement, decomposition, apportionment and analysis," Sankhya, Ser. A, 44, 1, 1–22.
  22. RAO, C.R. (1982c). "Diversity and dissimilarity coefficients: a unified approach," Theor. Popul. Biol., 21, 24–43.
  23. RATHIE, P.N. and KANNAPPAN, PL. (1973). "An Inaccuracy Function of type β," Ann. Inst. Statist. Math., 25, 205–214.
  24. ROUTLEDGE, R.D. (1979). "Diversity indices: Which ones are admissible?," J. Theor. Biol., 76, 503–515.
  25. ROUTLEDGE, R.D. (1984). "Estimating ecological components of diversity," OIKOS, 42, 23–29.
  26. SHANNON, C.E. (1948). "A Mathematical Theory of Communication," Bell System Tech. J., 27, 379–423, 623–656.
  27. SIMPSON, E.H. (1949). "Measurement of Diversity," Nature, 163, 688.
  28. WHITTAKER, R.H. (1975). Communities and Ecosystems. New York: Macmillan, Inc.
  29. ZVAROVA, J. (1973). "On asymptotic behaviour of a sample estimator of Rényi's information of order α," Trans. 6th Prague Conf. on Inf. Theory, Stat. Dec. Func., Rand. Proc., Prague: Czech. Acad. of Sci., pp. 919–924.

Copyright information

© Springer-Verlag Berlin Heidelberg 1988

Authors and Affiliations

  • María Angeles Gil (1)
  • Covadonga Caso (1)
  1. Departamento de Matemáticas, Universidad de Oviedo, Oviedo, Spain