Knowledge-Based Clustering in Computational Intelligence

  • Witold Pedrycz
Part of the Studies in Computational Intelligence book series (SCI, volume 63)

Summary

Clustering is commonly regarded as a synonym of unsupervised learning aimed at the discovery of structure in high-dimensional data. With the plethora of existing algorithms, the area offers a wide diversity of approaches, each with its own underlying features and potential applications. With the inclusion of fuzzy sets, fuzzy clustering became an integral component of Computational Intelligence (CI) and is now broadly exploited in fuzzy modeling, fuzzy control, pattern recognition, and exploratory data analysis. Many pursuits of CI are human-centric in the sense that they are either initiated or driven by some domain knowledge, or the results generated by the CI constructs are made easily interpretable. To follow this tendency of human-centricity, so clearly visible in the CI domain, the very concept of fuzzy clustering needs to be carefully revisited. We propose a paradigm shift that leads to the idea of knowledge-based clustering, in which the development of information granules (fuzzy sets) is governed by the use of data as well as domain knowledge supplied through interaction with developers, users, and experts. In this study, we elaborate on the concepts and algorithms of knowledge-based clustering by considering the well-known scheme of Fuzzy C-Means (FCM) and viewing it as an operational model with which a number of essential developments can be easily explained. The fundamental concepts discussed here involve clustering with domain knowledge articulated through partial supervision and proximity-based knowledge hints.
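
For orientation, the standard FCM scheme named in the summary can be sketched in a few lines of plain Python. This is a minimal illustration of the textbook alternating updates (prototypes, then the partition matrix) and not the chapter's own implementation. The function name `fcm`, its parameters, the fixed seed, and the sample points are all assumptions introduced here for illustration.

```python
import random


def fcm(data, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Textbook Fuzzy C-Means on a list of equal-length numeric tuples.

    Returns (prototypes, u), where u is a c x n partition matrix whose
    columns each sum to 1. All names here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial partition matrix; normalise each column to sum to 1.
    u = [[rng.random() for _ in range(n)] for _ in range(c)]
    for k in range(n):
        col_sum = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= col_sum

    prototypes = []
    for _ in range(max_iter):
        # Prototype update: mean of the data weighted by u_ik ** m.
        prototypes = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            prototypes.append(
                [sum(w[k] * data[k][j] for k in range(n)) / total
                 for j in range(dim)]
            )

        # Membership update from squared Euclidean distances.
        new_u = [[0.0] * n for _ in range(c)]
        for k in range(n):
            d2 = [sum((data[k][j] - prototypes[i][j]) ** 2 for j in range(dim))
                  for i in range(c)]
            hits = [i for i in range(c) if d2[i] == 0.0]
            if hits:
                # Point coincides with one or more prototypes: share fully.
                for i in hits:
                    new_u[i][k] = 1.0 / len(hits)
            else:
                inv = [d2[i] ** (-1.0 / (m - 1.0)) for i in range(c)]
                inv_sum = sum(inv)
                for i in range(c):
                    new_u[i][k] = inv[i] / inv_sum

        change = max(abs(new_u[i][k] - u[i][k])
                     for i in range(c) for k in range(n))
        u = new_u
        if change < tol:
            break

    # Prototypes correspond to the partition matrix of the previous step.
    return prototypes, u


# Illustrative data (made up for this sketch): two well-separated groups.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
prototypes, memberships = fcm(points, c=2)

# Hard label per point: index of the cluster with the highest membership.
labels = [max(range(2), key=lambda i: memberships[i][k])
          for k in range(len(points))]
```

The sketch covers only the data-driven part. The knowledge-based variants discussed in the chapter (partial supervision and proximity-based hints) would add domain knowledge on top of this scheme, and they are not shown here.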

Keywords

Association Rule, Fuzzy Cluster, Membership Grade, Information Granule, Partition Matrix
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2007

Authors and Affiliations

  • Witold Pedrycz, Department of Electrical & Computer Engineering, University of Alberta, Edmonton, Canada
