Journal of Classification, Volume 18, Issue 1, pp 35-55

K-modes Clustering

  • Anil Chaturvedi, Kraft Foods, GV529, 1 Kraft Court, Glenview, IL 60025, USA
  • Paul E. Green, The Wharton School, University of Pennsylvania
  • J. Douglas Carroll, Rutgers University, Graduate School of Management

We norm (defined as the limit of an Lp norm as p approaches zero).
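The norm mentioned above can be illustrated with a short numerical sketch. It assumes the usual convention that the limit is taken on the sum of p-th powers (without the 1/p root), in which case the sum tends to the number of nonzero entries as p approaches zero; for categorical records this amounts to counting mismatched attributes. The function names and the toy data are hypothetical and are not taken from the article.

```python
from collections import Counter

def lp_power_sum(diffs, p):
    """Sum of |d|**p over the nonzero entries; zero entries contribute 0."""
    return sum(abs(d) ** p for d in diffs if d != 0)

def l0_mismatch(a, b):
    """Number of positions at which two categorical records differ."""
    return sum(1 for x, y in zip(a, b) if x != y)

def column_modes(records):
    """Most frequent category in each column (ties go to the first seen)."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*records)]

# As p shrinks, the power sum approaches the count of nonzero entries (3 here).
diffs = [3.0, 0.0, -0.5, 0.0, 2.0]
for p in (1.0, 0.1, 0.001):
    print(p, lp_power_sum(diffs, p))

# Hypothetical categorical records: the column-wise modes form the single
# "center" that minimizes the total mismatch count over these records.
records = [
    ("red", "small", "round"),
    ("red", "large", "round"),
    ("blue", "small", "square"),
]
modes = column_modes(records)
total_loss = sum(l0_mismatch(r, modes) for r in records)
print(modes, total_loss)
```

Under this mismatch-count loss, the role played by the cluster mean in K-means is taken by the vector of per-variable modes, which is one plausible reading of the name K-modes.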

In Monte Carlo simulations, both K-modes and the latent class procedures (e.g., Goodman 1974) performed with equal efficiency in recovering a known underlying cluster structure. However, K-modes is an order of magnitude faster than the latent class procedures and suffers from fewer problems of local optima. For data sets involving a large number of categorical variables, latent class procedures become computationally extremely slow and hence infeasible.

We conjecture that, although latent class procedures might perform better than K-modes in some cases, K-modes could outperform them in others. Hence, we recommend that the two approaches be used as "complementary" procedures in performing cluster analysis. We also present an empirical comparison of K-modes and latent class procedures, in which K-modes prevails.