Abstract
The forward search provides a powerful and computationally simple approach for the robust analysis of multivariate data. In this paper we suggest a new forward search algorithm for clustering multivariate categorical observations. Classification based on categorical information poses a number of challenging issues that are addressed by our algorithm. These include selection of the number of groups, identification of outliers and stability of the suggested solution. The performance of the algorithm is shown with both simulated and real examples.
Copyright information
© 2006 Physica-Verlag Heidelberg
About this paper
Cite this paper
Cerioli, A., Riani, M., Atkinson, A.C. (2006). Robust classification with categorical variables. In: Rizzi, A., Vichi, M. (eds) Compstat 2006 - Proceedings in Computational Statistics. Physica-Verlag HD. https://doi.org/10.1007/978-3-7908-1709-6_41
DOI: https://doi.org/10.1007/978-3-7908-1709-6_41
Publisher Name: Physica-Verlag HD
Print ISBN: 978-3-7908-1708-9
Online ISBN: 978-3-7908-1709-6
eBook Packages: Mathematics and Statistics (R0)