Correspondence Analysis in the Case of Outliers
Correspondence analysis (CA) has recently become a popular tool for analyzing categorical data. Its behavior in the presence of outliers in the table has not been sufficiently explored in the literature, especially for multidimensional contingency tables. In our research we apply correspondence analysis to three-way contingency tables that contain outliers generated by deviations from the independence model. The outliers are chosen so that they break the independence in the table but are not large enough to be spotted easily without statistical analysis. We study the change in the CA row and column coordinates caused by the outliers and perform a numerical analysis of the outlier coordinates.
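The setup described above can be illustrated with a minimal sketch. It is not the authors' procedure. It assumes a 3 x 3 x 2 table with hypothetical marginal probabilities, a single cell inflated by an illustrative factor of 1.5, and a flattening of the three-way table into a two-way table (rows are the first variable, columns are combinations of the other two) so that simple CA via the singular value decomposition can be applied. The function name `ca_row_coordinates` and all numeric values are assumptions made for illustration.

```python
import numpy as np

def ca_row_coordinates(table, n_dims=2):
    """Simple CA: principal row coordinates and principal inertias,
    obtained from the SVD of the standardized residual matrix."""
    P = table / table.sum()                    # correspondence matrix
    r = P.sum(axis=1)                          # row masses
    c = P.sum(axis=0)                          # column masses
    expected = np.outer(r, c)                  # independence model for P
    S = (P - expected) / np.sqrt(expected)     # standardized residuals
    U, sv, _ = np.linalg.svd(S, full_matrices=False)
    F = (U * sv) / np.sqrt(r)[:, None]         # principal row coordinates
    return F[:, :n_dims], sv ** 2              # coordinates, principal inertias

# Hypothetical marginal probabilities of the three variables (assumption).
p_a = np.array([0.5, 0.3, 0.2])
p_b = np.array([0.4, 0.4, 0.2])
p_c = np.array([0.6, 0.4])
n = 1000

# Expected counts under complete independence: n * p_a[i] * p_b[j] * p_c[k].
clean = n * np.einsum("i,j,k->ijk", p_a, p_b, p_c)

# Inflate one cell to create a moderate outlier (the factor 1.5 is illustrative).
contaminated = clean.copy()
contaminated[0, 0, 0] *= 1.5

# Flatten to two-way tables: rows = first variable, columns = (second, third).
flat_clean = clean.reshape(3, -1)
flat_out = contaminated.reshape(3, -1)

coords_clean, inertia_clean = ca_row_coordinates(flat_clean)
coords_out, inertia_out = ca_row_coordinates(flat_out)

print("total inertia, independent table :", inertia_clean.sum())
print("total inertia, contaminated table:", inertia_out.sum())
print("row coordinates, contaminated table:\n", coords_out)
```

Under exact independence the standardized residuals vanish, so the total inertia of the clean table is zero up to rounding error. After contamination the inertia is positive and the row coordinates move away from the origin, which is the kind of change the study examines.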
Keywords: Contingency Table, Correspondence Analysis, Marginal Probability, Independence Model, Moderate Outlier
We thank Mikhail Langovoy and the anonymous reviewers of our article for their valuable comments.