Volume 76, Issue 6, pp 799–818

Estimation of a proportion in survey sampling using the logratio approach

  • Karel Hron
  • Matthias Templ
  • Peter Filzmoser


The estimation of a mean of a proportion is a frequent task in statistical survey analysis, and often such ratios are estimated from compositions such as income components, wage components, tax components, etc. In practice, the weighted arithmetic mean is regularly used to estimate the center of the data. However, this estimator is not appropriate if the ratios are estimated from compositions, because the sample space of compositional data is the simplex and not the usual Euclidean space. We demonstrate that the weighted geometric mean is useful for this purpose. Even for different sampling designs, the weighted geometric mean shows excellent behavior.
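The contrast between the two estimators can be sketched in a few lines of Python. This is only an illustration of the general idea, not necessarily the exact estimator studied in the paper: the function names, the three-part example compositions, and the sampling weights are all hypothetical, and every part is assumed to be strictly positive so that logarithms exist. The geometric version averages the logarithms of each part with the survey weights and then rescales the result to sum to 1 (the closure operation of compositional data analysis).

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def weighted_arithmetic_mean(compositions, weights):
    """Part-wise weighted arithmetic mean (the commonly used estimator)."""
    wsum = sum(weights)
    n_parts = len(compositions[0])
    return [
        sum(w * x[j] for x, w in zip(compositions, weights)) / wsum
        for j in range(n_parts)
    ]


def weighted_geometric_mean(compositions, weights):
    """Part-wise weighted geometric mean, closed to sum 1.

    Assumes strictly positive parts; zeros need separate treatment.
    """
    wsum = sum(weights)
    n_parts = len(compositions[0])
    log_means = [
        sum(w * math.log(x[j]) for x, w in zip(compositions, weights)) / wsum
        for j in range(n_parts)
    ]
    return closure([math.exp(m) for m in log_means])


# Hypothetical sample: three income components per unit, each row summing to 1.
sample = [
    [0.70, 0.20, 0.10],
    [0.50, 0.30, 0.20],
    [0.90, 0.05, 0.05],
]
# Hypothetical sampling weights (e.g. inverse inclusion probabilities).
weights = [120.0, 80.0, 200.0]

print("arithmetic:", weighted_arithmetic_mean(sample, weights))
print("geometric: ", weighted_geometric_mean(sample, weights))
```

Both results sum to 1, but they generally differ. The geometric version is the one that respects the relative (ratio) scale of the simplex.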


Survey sampling; Proportions; Compositional data; Logratio analysis



We want to thank Prof. Anne Ruiz-Gazen (Toulouse School of Economics) for helpful suggestions. The authors would also like to thank the anonymous reviewers for their valuable comments and suggestions to improve the paper. The authors also gratefully acknowledge the support by the Operational Program Education for Competitiveness - European Social Fund (project CZ.1.07/2.3.00/20.0170 of the Ministry of Education, Youth and Sports of the Czech Republic).



Copyright information

© Springer-Verlag Berlin Heidelberg 2012

Authors and Affiliations

  1. Department of Mathematical Analysis and Applications of Mathematics, Faculty of Science, Palacký University, Olomouc, Czech Republic
  2. Department of Geoinformatics, Faculty of Science, Palacký University, Olomouc, Czech Republic
  3. Department of Statistics and Probability Theory, Vienna University of Technology, Vienna, Austria
  4. Department of Statistics and Probability Theory, Vienna University of Technology, Vienna, Austria
