Abstract
In this work, a parametric approach is proposed for replacing data below the detection limit, also known as rounded zeros, in compositional data sets. Compositional rounded zeros correspond to small proportions of some whole that cannot be reliably detected by the analytical instruments under given operating conditions. Zeros of this kind appear frequently during data collection in the geosciences, and they must be treated adequately before any multivariate analysis can be applied. Our procedure results from a modification of the Expectation-Maximization (EM) algorithm and is based on the additive log-ratio transformation. Its coherence with the nature of compositional data and with the basic operations in the simplex sample space is checked. Using real data sets, we find that this approach improves on other parametric and non-parametric techniques for compositional rounded zeros.
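As an illustration of the additive log-ratio (alr) transformation on which the procedure is based, the following minimal sketch maps a composition to alr coordinates and back. It is not the authors' implementation and does not include the modified EM step. The function names (`closure`, `alr`, `alr_inverse`) and the choice of the last part as the reference are assumptions made for this example only. The sketch also shows why zeros need replacing first: the logarithm of a zero part is undefined.

```python
import math
from typing import List, Sequence


def closure(parts: Sequence[float]) -> List[float]:
    """Rescale positive parts so that they sum to 1."""
    total = sum(parts)
    return [p / total for p in parts]


def alr(composition: Sequence[float]) -> List[float]:
    """Additive log-ratio: log of each part divided by the last part."""
    if any(p <= 0 for p in composition):
        # A zero part (for example a value below the detection limit)
        # makes the log-ratio undefined, hence the need for replacement.
        raise ValueError("alr is undefined for zero or negative parts")
    ref = composition[-1]
    return [math.log(p / ref) for p in composition[:-1]]


def alr_inverse(coords: Sequence[float]) -> List[float]:
    """Map alr coordinates back to a composition that sums to 1."""
    expd = [math.exp(c) for c in coords] + [1.0]
    return closure(expd)


# Hypothetical three-part composition used only for illustration.
x = closure([0.6, 0.3, 0.1])
y = alr(x)            # two real-valued coordinates
x_back = alr_inverse(y)  # recovers x up to floating-point error
```

Because alr coordinates depend only on ratios of parts, rescaling the composition (for instance from proportions to percentages) leaves them unchanged.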
Cite this article
Palarea-Albaladejo, J., Martín-Fernández, J.A. & Gómez-García, J. A Parametric Approach for Dealing with Compositional Rounded Zeros. Math Geol 39, 625–645 (2007). https://doi.org/10.1007/s11004-007-9100-1
DOI: https://doi.org/10.1007/s11004-007-9100-1