Abstract
Most clustering algorithms are designed for datasets in which the dissimilarity between any two points can be computed using a standard distance measure such as the Euclidean distance. However, many real-life datasets are categorical in nature: the values in an attribute domain have no natural ordering. In such situations, clustering algorithms such as K-means [238] and fuzzy C-means (FCM) [62] cannot be applied. The K-means algorithm computes the center of a cluster as the mean of the feature vectors belonging to that cluster. Because categorical datasets have no inherent distance measure, the mean of a set of categorical feature vectors is meaningless. A variation of the K-means algorithm, namely Partitioning Around Medoids (PAM) or K-medoids [243], has been proposed for such datasets.
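To illustrate why a medoid remains well defined where a mean does not, the following minimal sketch picks the medoid of a small categorical cluster. It assumes a simple Hamming (attribute-mismatch) dissimilarity, which the abstract does not specify. The names `hamming` and `medoid` and the sample records are hypothetical and are not taken from the chapter.

```python
from typing import Sequence, Tuple

Record = Tuple[str, ...]


def hamming(a: Record, b: Record) -> int:
    """Count the attributes on which two categorical records disagree.

    Assumption: mismatch counting stands in for whatever dissimilarity
    the chapter actually uses.
    """
    if len(a) != len(b):
        raise ValueError("records must have the same number of attributes")
    return sum(x != y for x, y in zip(a, b))


def medoid(cluster: Sequence[Record]) -> Record:
    """Return the cluster member with the smallest total dissimilarity
    to all members. Unlike a mean, the result is always an actual record."""
    if not cluster:
        raise ValueError("cluster must not be empty")
    return min(cluster, key=lambda c: sum(hamming(c, p) for p in cluster))


# Hypothetical records with attributes (colour, size, shape).
cluster = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "round"),
    ("red", "small", "square"),
]

print(medoid(cluster))  # ('red', 'small', 'round')
```

The first record differs from each of the others in exactly one attribute (total dissimilarity 3), while every other record has a total of 5, so it is chosen as the medoid. A full PAM procedure would alternate this selection step with reassigning points to their nearest medoid; that loop is omitted here.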
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Maulik, U., Bandyopadhyay, S., Mukhopadhyay, A. (2011). Clustering Categorical Data in a Multiobjective Framework. In: Multiobjective Genetic Algorithms for Clustering. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-16615-0_8
DOI: https://doi.org/10.1007/978-3-642-16615-0_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-16614-3
Online ISBN: 978-3-642-16615-0
eBook Packages: Computer Science, Computer Science (R0)