Scalable Dynamic Self-Organising Maps for Mining Massive Textual Data

  • Yu Zheng Zhai
  • Arthur Hsu
  • Saman K. Halgamuge
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4234)

Abstract

Traditional text clustering methods require enormous computing resources, which makes them unsuitable for processing large-scale data collections. In this paper we present a clustering method based on the word category map approach using a two-level Growing Self-Organising Map (GSOM). A significant part of the clustering task is divided into separate sub-tasks that can be executed on different computers using emerging Grid technology, thus enabling the rapid analysis of information gathered globally. The performance of the proposed method is comparable to that of traditional approaches, while the execution time is reduced by a factor of 15.
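
The division into independent sub-tasks described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the data are partitioned or what each sub-task computes, so the function names (`split_into_subtasks`, `run_subtask`, `run_all`) are hypothetical, a local thread pool stands in for Grid nodes, and a simple word count stands in for training a map on each partition.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_subtasks(documents, n_parts):
    """Deal documents round-robin into n_parts independent partitions."""
    return [documents[i::n_parts] for i in range(n_parts)]


def run_subtask(partition):
    """Placeholder sub-task (assumption): count word occurrences.

    In the setting the abstract describes, this is where a map would be
    trained on one partition; a word count keeps the sketch runnable.
    """
    counts = Counter()
    for doc in partition:
        counts.update(doc.lower().split())
    return counts


def run_all(documents, n_parts=3):
    """Run the sub-tasks concurrently and merge their partial results."""
    parts = split_into_subtasks(documents, n_parts)
    # The thread pool is a local stand-in for separate Grid machines.
    with ThreadPoolExecutor(max_workers=n_parts) as pool:
        partials = list(pool.map(run_subtask, parts))
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged


if __name__ == "__main__":
    docs = ["grid computing news", "news article clustering", "grid resource"]
    print(run_all(docs, n_parts=2))
```

Because each partition is processed without reference to the others, the sub-tasks can run on different machines and only the merge step needs all partial results in one place.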

Keywords

Execution Time, News Article, Spread Factor, Grid Resource, Cluster Task
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Yu Zheng Zhai (1)
  • Arthur Hsu (1)
  • Saman K. Halgamuge (1)
  1. Dynamic System and Control Group, Department of Mechanical and Manufacturing Engineering, University of Melbourne, Victoria, Australia
