Outlier detection in contingency tables based on minimal patterns
A new technique for the detection of outliers in contingency tables is introduced, where outliers are unusual cell counts with respect to classical loglinear Poisson models. Subsets of cell counts called minimal patterns are defined, corresponding to non-singular design matrices and leading to potentially uncontaminated maximum-likelihood estimates of the model parameters and thereby the expected cell counts. A criterion to easily produce minimal patterns in the two-way case under independence is derived, based on the analysis of the positions of the chosen cells. A simulation study and a couple of real-data examples are presented to illustrate the performance of the newly developed outlier identification algorithm, and to compare it with other existing methods.
Keywords: Contingency tables, Robustness, Loglinear models, Outliers, Minimal patterns
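The abstract ties minimal patterns to non-singular design matrices. As a rough illustration only, the sketch below checks whether a chosen subset of cells in a two-way table yields a non-singular design submatrix under the independence model. It rests on several assumptions that are not taken from the paper: a dummy-coded design (intercept, row effects, column effects, with the first row and first column as reference levels), a subset size equal to the number of parameters, and the hypothetical function names `design_row`, `rank`, and `is_nonsingular_subset`. It is not the authors' position-based criterion, just a direct rank check.

```python
from fractions import Fraction


def design_row(i, j, n_rows, n_cols):
    """Design-matrix row for cell (i, j) under independence (assumed dummy coding).

    Columns: intercept, effects for rows 1..n_rows-1, effects for
    columns 1..n_cols-1. Row 0 and column 0 are the reference levels.
    """
    row = [0] * (n_rows + n_cols - 1)
    row[0] = 1
    if i > 0:
        row[i] = 1
    if j > 0:
        row[(n_rows - 1) + j] = 1
    return row


def rank(matrix):
    """Exact matrix rank by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in r] for r in matrix]
    rk = 0
    width = len(m[0]) if m else 0
    for col in range(width):
        # Find a row at or below position rk with a non-zero entry in this column.
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        # Eliminate the column entries below the pivot row.
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            if factor != 0:
                m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def is_nonsingular_subset(cells, n_rows, n_cols):
    """True if the cells give a square, full-rank design submatrix."""
    n_params = n_rows + n_cols - 1
    if len(cells) != n_params or len(set(cells)) != n_params:
        return False
    sub = [design_row(i, j, n_rows, n_cols) for i, j in cells]
    return rank(sub) == n_params


# 3 x 3 table: the independence model has 3 + 3 - 1 = 5 parameters.
first_row_and_column = [(0, 0), (0, 1), (0, 2), (1, 0), (2, 0)]
block_plus_corner = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2)]
ok_subset = is_nonsingular_subset(first_row_and_column, 3, 3)
bad_subset = is_nonsingular_subset(block_plus_corner, 3, 3)
```

In this example the first subset (the whole first row plus the whole first column) passes the check, while the second fails because the four cells of a 2 x 2 block have linearly dependent design rows. Exact rational arithmetic is used so that the rank decision does not depend on a floating-point tolerance.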
FR is partially supported by the Italian Ministry for University and Research, programme PRIN2009, grant number 2009H8WPX5.