Multilevel Clustering of Induction Rules for Web Meta-knowledge
The World Wide Web today holds a huge mass of knowledge, which makes it difficult to exploit. One way to cope with this issue is to mine knowledge in a way that controls its volume and hence keeps it manageable. This paper explores meta-knowledge discovery and focuses in particular on clustering induction rules for large knowledge sets. This knowledge representation is considered for its expressive power and hence its wide use. An adapted form of data mining is proposed to extract meta-knowledge, taking into account that the knowledge representation is more complex than simple data. In addition, a new clustering approach based on the multilevel paradigm, called multilevel clustering, is developed for treating large-scale knowledge sets. The approach invokes the k-means algorithm to cluster induction rules using newly designed similarity measures. The developed algorithms have been implemented and run on four public benchmarks to test the effectiveness of the multilevel clustering approach. The numerical results have been compared with those of the simple k-means algorithm. As expected, multilevel clustering clearly outperforms basic k-means on both execution time and success rate, with the success rate remaining constant at 100% as the number of induction rules increases.
Keywords: Knowledge mining, meta-knowledge, multilevel paradigm, k-means, k-nearest neighbors, induction rules, genetic algorithm