Challenges for Computational Intelligence, pp. 317-341
Knowledge-Based Clustering in Computational Intelligence
Summary
Clustering is commonly regarded as a synonym of unsupervised learning aimed at the discovery of structure in high-dimensional data. Given the plethora of existing algorithms, the area offers an outstanding diversity of approaches, each with its own underlying features and potential applications. With the inclusion of fuzzy sets, fuzzy clustering became an integral component of Computational Intelligence (CI) and is now broadly exploited in fuzzy modeling, fuzzy control, pattern recognition, and exploratory data analysis. Many pursuits of CI are human-centric in the sense that they are either initiated or driven by some domain knowledge, or the results generated by the CI constructs are made easily interpretable. To follow this tendency toward human-centricity, so profoundly visible in the CI domain, the very concept of fuzzy clustering needs to be carefully revisited. We propose a paradigm shift that brings us to the idea of knowledge-based clustering, in which the development of information granules (fuzzy sets) is governed by data as well as by domain knowledge supplied through interaction with developers, users, and experts. In this study, we elaborate on the concepts and algorithms of knowledge-based clustering by considering the well-known scheme of Fuzzy C-Means (FCM) and viewing it as an operational model through which a number of essential developments can be easily explained. The fundamental concepts discussed here involve clustering with domain knowledge articulated through partial supervision and proximity-based knowledge hints.
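For orientation, the baseline FCM scheme that the summary takes as its operational model can be sketched as follows. This is a minimal illustration of the standard, purely data-driven algorithm, assuming squared Euclidean distances and a random initial partition matrix; it does not implement the partial-supervision or proximity-based extensions discussed in the chapter. The function name `fcm` and the parameters `m` (fuzzification coefficient), `iters`, `tol`, and `seed` are assumptions made for this sketch, not names taken from the chapter.

```python
import random


def fcm(data, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Plain Fuzzy C-Means on a list of equal-length numeric tuples.

    Returns (prototypes, u), where u[i][k] is the membership grade of
    data point k in cluster i and every column of u sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(data), len(data[0])

    # Random initial partition matrix, normalized so columns sum to 1.
    u = [[rng.random() + 1e-9 for _ in range(n)] for _ in range(c)]
    for k in range(n):
        s = sum(u[i][k] for i in range(c))
        for i in range(c):
            u[i][k] /= s

    prototypes = []
    for _ in range(iters):
        # Prototype update: mean of the data weighted by u_ik^m.
        prototypes = []
        for i in range(c):
            w = [u[i][k] ** m for k in range(n)]
            total = sum(w)
            prototypes.append(tuple(
                sum(w[k] * data[k][d] for k in range(n)) / total
                for d in range(dim)))

        # Membership update from squared distances to the prototypes.
        change = 0.0
        for k in range(n):
            d2 = [sum((data[k][d] - prototypes[i][d]) ** 2
                      for d in range(dim)) for i in range(c)]
            hits = [i for i in range(c) if d2[i] <= 1e-12]
            if hits:
                # The point sits on one or more prototypes: split the
                # membership evenly among them to avoid dividing by zero.
                new = [1.0 / len(hits) if i in hits else 0.0
                       for i in range(c)]
            else:
                inv = [x ** (-1.0 / (m - 1.0)) for x in d2]
                s = sum(inv)
                new = [x / s for x in inv]
            for i in range(c):
                change = max(change, abs(new[i] - u[i][k]))
                u[i][k] = new[i]

        # Stop once the partition matrix has stabilized.
        if change < tol:
            break
    return prototypes, u
```

Calling `fcm(points, c=2)` on a list of 2-D tuples returns two prototypes and a 2-by-n partition matrix. Knowledge-based variants would, broadly speaking, add terms to the objective or constraints on this matrix, which this sketch deliberately leaves out.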
Keywords
Association Rule, Fuzzy Cluster, Membership Grade, Information Granule, Partition Matrix