Hybrid k-Means: Combining Regression-Wise and Centroid-Based Criteria for QSAR

  • Robert Stanforth
  • Evgueni Kolossov
  • Boris Mirkin

Abstract

This paper further extends the ‘kernel’-based approach to clustering proposed by E. Diday in the early 1970s. According to this approach, a cluster’s centroid can be represented by the parameters of any analytical model, such as a linear regression equation, built over the cluster. We address the problem of producing regression-wise clusters that are separated in the input variable space by building a hybrid clustering criterion that combines the regression-wise clustering criterion with the conventional centroid-based one.
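
The idea of a hybrid criterion can be illustrated with a minimal sketch. The abstract does not give the exact form of the criterion, so everything specific below is an assumption made for illustration: the function name `hybrid_kmeans`, the convex mixing parameter `weight`, the random initial partition, and the use of ordinary least squares within each cluster. Each cluster is described by both a centroid in the input space and a linear regression of the target on the inputs, and an entity is assigned to the cluster that minimises a weighted sum of its squared distance to the centroid and its squared regression residual.

```python
import numpy as np


def hybrid_kmeans(X, y, k, weight=0.5, n_iter=50, seed=0):
    """Alternating-minimisation sketch of a hybrid clustering criterion.

    Assumed per-entity cost of assigning entity i to cluster c:
        (1 - weight) * ||x_i - centroid_c||^2
        + weight * (y_i - [x_i, 1] @ coef_c)^2
    weight = 0 gives conventional k-means on X; weight = 1 gives purely
    regression-wise clustering. This form is an illustrative assumption.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    Xa = np.hstack([X, np.ones((n, 1))])  # inputs plus an intercept column
    labels = rng.integers(0, k, size=n)   # random initial partition
    centroids = np.zeros((k, d))
    coefs = np.zeros((k, d + 1))
    for _ in range(n_iter):
        # Update step: refit the centroid and the regression of each cluster.
        for c in range(k):
            members = labels == c
            if not members.any():
                continue  # an empty cluster keeps its previous parameters
            centroids[c] = X[members].mean(axis=0)
            coefs[c] = np.linalg.lstsq(Xa[members], y[members], rcond=None)[0]
        # Assignment step: combine both squared-error terms per cluster.
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        resid = (y[:, None] - Xa @ coefs.T) ** 2
        new_labels = np.argmin((1 - weight) * dist + weight * resid, axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
    return labels, centroids, coefs
```

A call such as `labels, centroids, coefs = hybrid_kmeans(X, y, k=3, weight=0.5)` takes `X` as an (n, d) array of descriptors and `y` as a length-n array of target values. In this sketch both update rules minimise their own term for a fixed partition (the mean for the centroid term, least squares for the regression term), so the combined cost does not increase between iterations. The result depends on the initial partition and on the scaling of `X` and `y`.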

References

  1. DIDAY, E. (1974): Optimization in non-hierarchical clustering. Pattern Recognition 6(1), 17–33.
  2. DIDAY, E., CELEUX, G., GOVAERT, G., LECHEVALLIER, Y., and RALAMBONDRAINY, H. (1989): Classification Automatique des Données. Dunod, Paris.
  3. MIRKIN, B. (2005): Clustering for Data Mining: A Data Recovery Approach. Chapman & Hall/CRC, Boca Raton, FL.
  4. TABACHNICK, B.G. and FIDELL, L.S. (2006): Using Multivariate Statistics (5th Edition). Allyn & Bacon, Boston.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Robert Stanforth (1, 2)
  • Evgueni Kolossov (1)
  • Boris Mirkin (2)

  1. ID Business Solutions, Guildford, UK
  2. School of Computer Science, Birkbeck, University of London, London, UK
