A New Evolving Tree for Text Document Clustering and Visualization

  • Wui Lee Chang
  • Kai Meng Tay
  • Chee Peng Lim
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 223)


The Self-Organizing Map (SOM) is a popular neural network model for clustering and visualization problems. However, it suffers from two major limitations: (1) it does not support online learning; and (2) the map size has to be predetermined, which can lead to many trial-and-error runs before arriving at an optimal map size. In this study, an evolving model, the Evolving Tree (ETree), is therefore used as an alternative to the SOM for a text document clustering problem. ETree forms a hierarchical (tree) structure in which nodes are allowed to grow, and each leaf node represents a cluster of documents. An experimental study is conducted using articles from the Engineering Conference (ENCON), a flagship conference of Universiti Malaysia Sarawak (UNIMAS). The experimental results are analyzed and discussed, and the outcome demonstrates a new application of ETree in text document clustering and visualization.
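
The growing tree structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It is a simplified toy under stated assumptions: only the winning leaf is updated (no neighborhood update over the tree), a leaf splits after a fixed number of hits, and the names `Node`, `find_leaf`, `train_step`, `lr`, `split_threshold` and `n_children` are hypothetical. The two-dimensional points stand in for document feature vectors.

```python
import math
import random


class Node:
    """A tree node holding a prototype (weight) vector."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.children = []  # empty list means this node is a leaf
        self.hits = 0       # number of samples this leaf has won


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def find_leaf(root, x):
    """Descend from the root, following the closest child at each level."""
    node = root
    while node.children:
        node = min(node.children, key=lambda c: distance(c.weights, x))
    return node


def train_step(root, x, lr=0.3, split_threshold=5, n_children=2, rng=random):
    """Process one sample online: update the winning leaf, split it if busy."""
    leaf = find_leaf(root, x)
    leaf.weights = [w + lr * (xi - w) for w, xi in zip(leaf.weights, x)]
    leaf.hits += 1
    if leaf.hits >= split_threshold:
        # Assumption: children start as slightly perturbed copies of the parent.
        leaf.children = [
            Node([w + rng.uniform(-1e-3, 1e-3) for w in leaf.weights])
            for _ in range(n_children)
        ]
    return leaf


def leaves(node):
    """Collect all leaf nodes; each leaf stands for one cluster."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


# Toy data: two well-separated groups of 2-D points, presented one at a time.
rng = random.Random(0)
root = Node([0.0, 0.0])
data = [[0.1, 0.0], [0.0, 0.2], [5.0, 5.1], [5.2, 4.9]] * 10
for sample in data:
    train_step(root, sample, rng=rng)

low_cluster = find_leaf(root, [0.0, 0.0])
high_cluster = find_leaf(root, [5.0, 5.0])
```

Because samples are processed one at a time and the tree adds nodes only where data accumulate, the sketch needs no map size fixed in advance, which is the property the abstract contrasts with the SOM.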


Evolving tree, text document clustering, online learning



Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  1. Faculty of Engineering, Universiti Malaysia Sarawak, Sarawak, Malaysia
  2. Centre for Intelligent Systems Research, Deakin University, Geelong, Australia
