
Impact of Mixed Metrics on Clustering

  • Karina Gibert
  • Ramon Nonell
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2905)

Abstract

One of the elements involved in clustering is the evaluation of distances between individuals. This paper concerns the use of mixed metrics for clustering messy data. When facing complex real domains, it becomes natural to deal simultaneously with numerical and symbolic attributes. This can be handled with different approaches; here, mixed metrics are used.

A family of mixed metrics introduced by Gibert is applied with different parameter values to an experimental data set, in order to assess their impact on the final classes.
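
To make the idea of a parameterised mixed metric concrete, here is a minimal sketch in Python. It is not the metric family used in the paper: the function name `mixed_distance_sq`, the weights `alpha` and `beta`, the variance scaling of the numerical part, and the simple mismatch count for the symbolic part are all assumptions chosen for illustration. The only point it shows is how two weights can shift the balance between numerical and symbolic attributes when computing a distance between two individuals.

```python
from typing import Sequence


def mixed_distance_sq(
    x_num: Sequence[float],
    y_num: Sequence[float],
    variances: Sequence[float],
    x_cat: Sequence[str],
    y_cat: Sequence[str],
    alpha: float = 1.0,
    beta: float = 1.0,
) -> float:
    """Illustrative squared mixed distance between two individuals.

    Hypothetical form: alpha * (numerical part) + beta * (symbolic part).
    """
    # Numerical part: squared Euclidean distance, each variable scaled by
    # its variance (an assumption, so that units do not dominate).
    numerical = sum(
        (a - b) ** 2 / v for a, b, v in zip(x_num, y_num, variances)
    )
    # Symbolic part: number of categorical attributes on which the two
    # individuals disagree (a simple stand-in for a categorical metric).
    symbolic = sum(1 for a, b in zip(x_cat, y_cat) if a != b)
    return alpha * numerical + beta * symbolic


# Two made-up individuals with two numerical and two symbolic attributes.
d_both = mixed_distance_sq(
    [1.0, 10.0], [3.0, 14.0], [4.0, 16.0], ["red", "yes"], ["blue", "yes"]
)
d_numeric_only = mixed_distance_sq(
    [1.0, 10.0], [3.0, 14.0], [4.0, 16.0], ["red", "yes"], ["blue", "yes"],
    alpha=1.0, beta=0.0,
)
print(d_both, d_numeric_only)
```

Changing `alpha` and `beta` changes which pairs of individuals count as close, which in turn can change the classes a clustering algorithm produces; this is the kind of effect the paper sets out to assess.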

Keywords

clustering, metrics, qualitative and quantitative variables, messy data, ill-structured domains

References

  1. Anderberg, M.R.: Cluster Analysis for Applications. Academic Press, London (1973)
  2. Diday, E., Moreau, J.V.: Learning hierarchical clustering from examples. In: Rapport N 289 Centre de Rocquencourt (editor). INRIA (1984)
  3. Dillon, W.R., et al.: Multivariate analysis. Methods & applications. Wiley, Chichester (1984)
  4. Gibert, K.: Klass. Estudi d’un sistema d’ajuda al tractament estadístic de grans bases de dades. Master’s thesis, UPC (1991)
  5. Gibert, K.: L’ús de la Informació Simbòlica en l’Automatització del Tractament Estadístic de Dominis Poc Estructurats. PhD thesis, UPC, Barcelona, Spain (1994)
  6. Gibert, K.: The use of symbolic information in automation of statistical treatment for ill-structured domains. AI Communications 9(1), 36–37 (1996)
  7. Gibert, K., Annicchiarico, R., et al.: KDD on functional disabilities using clustering based on rules on WHO-DAS II. In: ITI 2003, Croatia, pp. 181–186 (2003)
  8. Gibert, K., Cortés, U.: KLASS: Una herramienta estadística para... poco estructurados. In: Noriega (ed.) Proc. IBERAMIA 1992, México, pp. 483–497 (1992)
  9. Gibert, K., Cortés, U.: Combining a knowledge-based system and a clustering method. LNS, vol. 89, pp. 351–360. Springer, Heidelberg (1994)
  10. Gibert, K., Cortés, U.: Weighing quantitative and qualitative variables in clustering methods. Mathware and Soft Computing 4(3), 251–266 (1997)
  11. Gibert, K., Cortés, U.: Clustering based on rules and knowledge discovery in ill-structured domains. Computación y Sistemas 1(4), 213–227 (1998)
  12. Gibert, K., Sonicki, Z.: Classification Based on Rules and Thyroids Dysfunctions. Applied Stochastic Models in Business and Industry 15(4), 319–324 (1999)
  13. Chidananda Gowda, K., Diday, E.: Symbolic clustering using a new similarity measure. IEEE Tr SMC 22(2) (March/April 1991)
  14. Gower, J.C.: A general coefficient of similarity. Biometrics 27, 857–874 (1971)
  15. Ichino, M., Yaguchi, H.: Generalized Minkowski Metrics for Mixed feature-type data analysis. IEEE Tr SMC 22(2), 146–153 (1994)
  16. Kaufman, L., Rousseeuw, P.J.: Finding Groups in Data, an Introduction to Cluster Analysis. John Wiley & Sons, London (1990)
  17. Lebart, L.: Traitement statistique des données. Dunod, Paris (1990)
  18. Nakhaeizadeh, G.: Classification as a subtask of Data Mining experiences from some industrial projects. In: IFCS, Kobe, Japan, vol. 1, pp. 17–20 (1996)
  19. R-Schulcoper, J.: Data analysis between sets of objects. In: ICSRIC 1996, vol. VIII, pp. 81–85 (1996)
  20. Ralambondrainy, H.: A clustering method for nominal data and mixture. H.H.Bock, Elsevier Science Publishers, B.V, North-Holland (1988)
  21. Ralambondrainy, H.: A conceptual version of the K-means algorithm. Lifetime Learning Publications, Belmont (1995)
  22. Roux, M.: Algorithmes de classification. Masson, Paris (1985)
  23. Shortliffe, E.H.: MYCIN: A rule-based computer program for advising physicians regarding antimicrobial therapy selection. PhD thesis, Stanford
  24. Volle, M.: Analyse des données. Ed. Economica, Paris, France (1985)

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Karina Gibert (1)
  • Ramon Nonell (1)
  1. Department of Statistics and Operations Research, Universitat Politècnica de Catalunya, Barcelona, Spain
