Scalable Dynamic Self-Organising Maps for Mining Massive Textual Data
Traditional text clustering methods require enormous computing resources, which makes them unsuitable for processing large-scale data collections. In this paper we present a clustering method based on the word category map approach using a two-level Growing Self-Organising Map (GSOM). A significant part of the clustering task is divided into separate sub-tasks that can be executed on different computers using emerging Grid technology, thus enabling the rapid analysis of information gathered globally. The performance of the proposed method is comparable to that of traditional approaches, while execution time is reduced by a factor of 15.
Keywords: Execution Time, News Article, Spread Factor, Grid Resource, Cluster Task
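The spread factor listed among the keywords is the parameter that controls how far a GSOM grows. As a minimal sketch, assuming the standard GSOM formulation in which the growth threshold is GT = -D × ln(SF) for data dimensionality D and spread factor SF in (0, 1), the growth test can be written as below. The function names are illustrative and are not taken from the paper, and the sketch omits the further condition that only boundary nodes grow.

```python
import math


def growth_threshold(dimension: int, spread_factor: float) -> float:
    """Return GT = -D * ln(SF), assuming the standard GSOM formulation.

    A spread factor close to 0 gives a high threshold (small, coarse map);
    a spread factor close to 1 gives a low threshold (larger, finer map).
    """
    if dimension < 1:
        raise ValueError("dimension must be a positive integer")
    if not 0.0 < spread_factor < 1.0:
        raise ValueError("spread_factor must lie strictly between 0 and 1")
    return -dimension * math.log(spread_factor)


def should_grow(accumulated_error: float, dimension: int, spread_factor: float) -> bool:
    """Hypothetical helper: a node is a growth candidate once its
    accumulated quantisation error exceeds the growth threshold."""
    return accumulated_error > growth_threshold(dimension, spread_factor)
```

Under this assumption, a first-level map trained with a low spread factor stays coarse and cheap to train, while a higher spread factor yields a more detailed map at greater cost.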