Abstract
Principal axis methods such as principal component analysis (PCA) and correspondence analysis (CA) are useful for identifying structures in data through planar graphic displays. Some kinds of data sets, however, can be analyzed with either PCA or CA. This paper focuses on these two methods and on the visual displays they produce. Our aim is to illustrate, from an applied point of view, what choosing one method over the other implies for a potential user, including the advantages and disadvantages of each. The question is covered broadly in textbooks and elsewhere on the basis of theoretical arguments. Our contribution is to compare the two methods on the same data set, so as to illustrate them for the practitioner. In the first part of this paper we present a novel analytical study of a binary matrix associated with a non-oriented axis-symmetric graph and show that CA outperforms standardized PCA for the reconstitution and visualization of such graphs. In the second part we present a case study using real data on the distribution of employees across economic sectors in the countries of the European Union, analyzed by means of standardized PCA and two-way CA, in order to see how the two methods differ in practice.
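To make the comparison concrete, the following minimal sketch applies both methods to the binary matrix of a small non-oriented graph. The graph used here (a chain of 12 vertices with self-loops, symmetric about its midpoint) and the helper names `chain_with_loops`, `ca_row_coords` and `standardized_pca_scores` are assumptions chosen for illustration; they are not necessarily the graph or the computations studied in the paper.

```python
import numpy as np

def chain_with_loops(n):
    """Binary matrix of an undirected chain on n vertices.
    Self-loops on the diagonal are an assumption of this sketch."""
    M = np.eye(n)
    i = np.arange(n - 1)
    M[i, i + 1] = 1.0
    M[i + 1, i] = 1.0
    return M

def ca_row_coords(N, k=2):
    """Simple CA: principal row coordinates and principal inertias."""
    P = N / N.sum()                       # correspondence matrix
    r = P.sum(axis=1)                     # row masses
    c = P.sum(axis=0)                     # column masses
    # Standardized residuals from the independence model r c^T
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, s, _ = np.linalg.svd(S, full_matrices=False)
    coords = (U[:, :k] * s[:k]) / np.sqrt(r)[:, None]
    return coords, s**2

def standardized_pca_scores(X, k=2):
    """PCA on centered, unit-variance columns: scores and eigenvalues."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, s, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :k] * s[:k], s**2 / X.shape[0]

M = chain_with_loops(12)
ca_xy, ca_inertia = ca_row_coords(M)            # first CA plane
pca_xy, pca_var = standardized_pca_scores(M)    # first PCA plane
```

Plotting `ca_xy` and `pca_xy` side by side, with edges drawn between adjacent vertices, gives the kind of planar displays whose ability to recover the graph can then be compared.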
Acknowledgments
This work has been supported by the Basque Government under UPV/EHU research Grant EOPT (IT-567-13), BETS UPV/EHU Research and Teaching Unit (UFI) and Spanish Ministry of Economy and Competitiveness research Grant MTM 2012-31514. Thanks to our colleague F. Tusell-Palmer for proofreading this document and to the anonymous referees whose comments have undoubtedly improved this paper.
Cite this article
Fernández-Aguirre, K., Garín-Martín, M.A. & Modroño-Herrán, J.I. Visual displays: analytical study and applications to graphs and real data. Qual Quant 48, 2209–2224 (2014). https://doi.org/10.1007/s11135-013-9887-4
DOI: https://doi.org/10.1007/s11135-013-9887-4