Abstract
Estimating the mean of a proportion is a frequent task in statistical survey analysis, and such proportions are often derived from compositions such as income components, wage components, or tax components. In practice, the weighted arithmetic mean is regularly used to estimate the center of the data. However, this estimator is not appropriate when the proportions come from compositions, because the sample space of compositional data is the simplex and not the usual Euclidean space. We demonstrate that the weighted geometric mean is useful for this purpose. Even under different sampling designs, the weighted geometric mean shows excellent behavior.
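The contrast between the two estimators can be illustrated with a minimal sketch. The abstract does not state the estimator's formula, so the code below assumes the standard closed weighted geometric mean from compositional data analysis: a weighted average of the log parts, exponentiated and rescaled to sum to 1. The function names, the three-part compositions, and the sampling weights are all hypothetical and are not taken from the paper.

```python
import numpy as np


def closure(x):
    """Rescale positive parts so that each composition sums to 1."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


def weighted_arithmetic_mean(comps, weights):
    """Part-wise weighted arithmetic mean of closed compositions."""
    comps = closure(comps)
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * comps).sum(axis=0) / w.sum()


def weighted_geometric_mean(comps, weights):
    """Closed weighted geometric mean (assumed form, see lead-in).

    Averages the logarithms of the parts with the given weights,
    exponentiates, and closes the result back onto the simplex.
    All parts must be strictly positive.
    """
    comps = closure(comps)
    w = np.asarray(weights, dtype=float)
    log_mean = (w[:, None] * np.log(comps)).sum(axis=0) / w.sum()
    return closure(np.exp(log_mean))


# Hypothetical sample: rows are units, columns are three income components.
comps = np.array([
    [0.70, 0.20, 0.10],
    [0.50, 0.30, 0.20],
    [0.90, 0.05, 0.05],
])
# Hypothetical sampling weights (for example, inverse inclusion probabilities).
weights = np.array([120.0, 80.0, 200.0])

am = weighted_arithmetic_mean(comps, weights)
gm = weighted_geometric_mean(comps, weights)
print("weighted arithmetic mean:", am)
print("weighted geometric mean: ", gm)
```

For a two-part composition, the first part of this geometric mean is the inverse logit of the weighted mean of the logits of the observed proportions, which is one way to see the connection to the logratio approach.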
Acknowledgments
We thank Prof. Anne Ruiz-Gazen (Toulouse School of Economics) for helpful suggestions and the anonymous reviewers for their valuable comments and suggestions, which improved the paper. The authors also gratefully acknowledge the support of the Operational Program Education for Competitiveness - European Social Fund (project CZ.1.07/2.3.00/20.0170 of the Ministry of Education, Youth and Sports of the Czech Republic).
Cite this article
Hron, K., Templ, M. & Filzmoser, P. Estimation of a proportion in survey sampling using the logratio approach. Metrika 76, 799–818 (2013). https://doi.org/10.1007/s00184-012-0416-6
DOI: https://doi.org/10.1007/s00184-012-0416-6