Abstract
In social, economic, and technical data analysis, compositional data are widely used wherever the question concerns proportions of a whole. This paper develops regression modelling methods for compositional data, covering the relationship of one composition to one or more other compositions and the interrelationships among multiple compositions. The centered logratio transformation proposed by Aitchison (The Statistical Analysis of Compositional Data, Chapman and Hall, 1986) is combined with three Partial Least Squares (PLS) techniques: PLS regression, hierarchical PLS, and PLS path modelling, respectively. Together these resolve the particular difficulties of compositional data regression modelling, namely the sum-to-unit constraint, the high multicollinearity of the transformed compositional data, and the hierarchical relationships among multiple compositions; moreover, the modelling results satisfy the theoretical requirement of a logcontrast. Case studies of the employment structure of Beijing's three industries illustrate the high goodness-of-fit and strong explanatory power of the models.
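As a rough illustration of the centered logratio (clr) transformation named in the abstract, the following sketch maps each composition to the logarithms of its parts minus their mean. It is not the chapter's implementation: the function names `clr` and `clr_inverse` and the three-part shares are hypothetical, chosen only to show why the transformed parts sum to zero in every row, which is the source of the collinearity that motivates PLS.

```python
import numpy as np


def clr(x):
    """Centered logratio transform of compositions stored one per row."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("clr requires strictly positive parts")
    # Close each row so that its parts sum to one.
    x = x / x.sum(axis=1, keepdims=True)
    log_x = np.log(x)
    # Subtracting the row mean of the logs is the same as dividing
    # each part by the geometric mean of its row before taking logs.
    return log_x - log_x.mean(axis=1, keepdims=True)


def clr_inverse(z):
    """Map clr coordinates back to compositions with unit sum."""
    e = np.exp(np.asarray(z, dtype=float))
    return e / e.sum(axis=1, keepdims=True)


# Hypothetical shares of three sectors in three years (illustrative only).
shares = np.array([
    [0.10, 0.40, 0.50],
    [0.08, 0.37, 0.55],
    [0.05, 0.35, 0.60],
])

z = clr(shares)
row_sums = z.sum(axis=1)    # zero in every row, up to rounding
recovered = clr_inverse(z)  # closing the exponentials returns the shares
```

Because every row of `z` sums to zero, its columns are linearly dependent, so ordinary least squares on clr-transformed predictors is ill-posed; this is the multicollinearity the abstract refers to.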
References
Aitchison, J. (1986). The statistical analysis of compositional data. London: Chapman and Hall.
Bayol, M. P., Delafoye, A., Tellier, C., & Tenenhaus, M. (2000). Use of PLS path modeling to estimate the European consumer satisfaction index (ECSI) model. Statistica Applicata - Italian Journal of Applied Statistics, 12(3), 361–375.
Chin, W. W. (1998). The partial least squares approach for structural equation modeling. In: G. A. Marcoulides (Ed.), Modern methods for business research. NJ, USA: Lawrence Erlbaum Associates.
Chin, W. W., & Newsted, P. R. (1999). Structural equation modeling analysis with small samples using partial least squares. In: R. Hoyle (Ed.), Statistical strategies for small sample research. Thousand Oaks, CA, USA: Sage.
Eriksson, L., Johansson, E., Wold, N., & Wold, S. (2001). Multi- and megavariate data analysis–principles and applications. Umea, Sweden: Umetrics AB.
Ferrers, N. M. (1866). An elementary treatise on trilinear coordinates. London: Macmillan.
Guinot, C., Latreille, J., & Tenenhaus, M. (2001). PLS path modelling and multiple table analysis. Application to the cosmetic habits of women in Ile-de-France. Chemometrics and Intelligent Laboratory Systems, 58, 247–259.
Westerhuis, J. A., Kourti, T., & MacGregor, J. F. (1998). Analysis of multiblock and hierarchical PCA and PLS models. Journal of Chemometrics, 12, 301–321.
Joreskog, K. G. (1970). A general method for analysis of covariance structure. Biometrika, 57, 239–251.
Lohmoller, J. B. (1984). LVPLS Program Manual Version 1.6. Zentralarchiv für Empirische Sozialforschung. Köln: Universität zu Köln.
Lohmoller, J. B. (1989). Latent variables path modeling with partial least squares. Heidelberg: Physica.
Noonan, R., & Wold, H. (1982). PLS path modeling with indirectly observed variables: a comparison of alternative estimates for the latent variable. In: K. G. Joreskog, & H. Wold (Eds.), Systems under indirect observation. Amsterdam: North-Holland.
Pages, J., & Tenenhaus, M. (2001). Multiple factor analysis combined with PLS path modelling. Application to the analysis of relationships between physicochemical variables, sensory profiles and hedonic judgements. Chemometrics and Intelligent Laboratory Systems, 58, 261–273.
Pearson, K. (1897). On a form of spurious correlation which may arise when indices are used in the measurement of organs. In: M. Emery (Ed.), Mathematical contributions to the theory of evolution. London: Proceedings of the Royal Society.
Wang, H. W. (1999). Partial least-squares regression method and application. Beijing: National Defence Industry Press.
Wold, H. (1982). Soft modeling: the basic design and some extensions. In: K. G. Joreskog, & H. Wold (Eds.), Systems under indirect observation, Part 2. Amsterdam: North-Holland.
Wold, H. (1985). Partial least squares. In: S. Kotz, & N. L. Johnson (Eds.), Encyclopedia of statistical sciences. New York: Wiley.
Wold, S., Kettaneh, N., & Tjessem, K. (1996). Hierarchical multiblock PLS and PC models for easier model interpretation and as an alternative to variable selection. Journal of Chemometrics, 10, 463–482.
Wold, S., Trygg, J., Berglund, A., & Antti, H. (2001). Some recent developments in PLS modeling. Chemometrics and Intelligent Laboratory Systems, 58, 131–150.
Zhang, Y. T. (2000). The statistical analysis generality of compositional data. Beijing: Science.
Acknowledgements
This research was supported by the National Natural Science Foundation of China (NSFC) under grant numbers 70125003, 70371007, 70531010, and 70521001.
© 2010 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Wang, H., Meng, J., Tenenhaus, M. (2010). Regression Modelling Analysis on Compositional Data. In: Esposito Vinzi, V., Chin, W., Henseler, J., Wang, H. (eds) Handbook of Partial Least Squares. Springer Handbooks of Computational Statistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-32827-8_18
DOI: https://doi.org/10.1007/978-3-540-32827-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32825-4
Online ISBN: 978-3-540-32827-8
eBook Packages: Mathematics and Statistics (R0)