Mining Multiple Clustering Data for Knowledge Discovery

  • Thanh Tho Quan
  • Siu Cheung Hui
  • Alvis Fong
Part of the Lecture Notes in Computer Science book series (LNCS, volume 2843)

Abstract

Clustering has been widely used for knowledge discovery. In this paper, we propose an approach known as Multi-Clustering, which mines the data generated by different clustering methods to discover relationships between clusters of data. The proposed technique first generates combined vectors from the multiple clustering data. The distances between the combined vectors are then calculated using the Mahalanobis distance, and the Agglomerative Hierarchical Clustering method is used to cluster the combined vectors. Finally, relationship vectors that can be used to identify the cluster relationships are generated. To illustrate the technique, we discuss an application example that uses Multi-Clustering to mine author clusters and document clusters in order to identify the relationships between authors and the research areas they work on. The performance of the proposed technique is also evaluated.
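The first three steps of the pipeline can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not define how a combined vector is built, so the sketch assumes it is the concatenation of one-hot cluster memberships from each clustering. It also assumes a pseudo-inverse of the covariance matrix (one-hot columns make the covariance singular) and average linkage for the agglomerative step. The function names and the toy labels are hypothetical, and the final relationship-vector step is omitted because the abstract does not describe it.

```python
import numpy as np

def combined_vectors(labelings):
    """Assumed encoding: one-hot encode each clustering and concatenate per item."""
    blocks = []
    for labels in labelings:
        labels = np.asarray(labels)
        ids = np.unique(labels)
        blocks.append((labels[:, None] == ids[None, :]).astype(float))
    return np.hstack(blocks)

def mahalanobis_matrix(X):
    """Pairwise Mahalanobis distances between the rows of X.

    The pseudo-inverse is an assumption made here because the covariance
    of one-hot encoded columns is singular.
    """
    vi = np.linalg.pinv(np.cov(X, rowvar=False))
    diff = X[:, None, :] - X[None, :, :]
    d2 = np.einsum("ijk,kl,ijl->ij", diff, vi, diff)
    return np.sqrt(np.maximum(d2, 0.0))  # clip tiny negative rounding errors

def agglomerative(D, n_clusters):
    """Average-linkage agglomerative clustering on a distance matrix D."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = D[np.ix_(clusters[a], clusters[b])].mean()
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] += clusters.pop(b)  # merge the closest pair
    labels = np.empty(len(D), dtype=int)
    for k, members in enumerate(clusters):
        labels[members] = k
    return labels

# Hypothetical toy data: six items, each labeled by two separate clusterings.
author_clusters = [0, 0, 1, 1, 2, 2]
document_clusters = [0, 0, 0, 1, 1, 1]

X = combined_vectors([author_clusters, document_clusters])
D = mahalanobis_matrix(X)
labels = agglomerative(D, n_clusters=3)
print(labels)
```

Items that fall in the same cluster under every input clustering have identical combined vectors, so their distance is zero and they are merged first. Other linkage criteria or encodings could be substituted without changing the overall structure.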

Keywords

Cluster Method; Data Item; Mahalanobis Distance; Document Cluster; Combine Vector
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2003

Authors and Affiliations

  • Thanh Tho Quan (1)
  • Siu Cheung Hui (1)
  • Alvis Fong (1)
  1. School of Computer Engineering, Nanyang Technological University, Singapore
