Concept Features Extraction and Text Clustering Analysis of Neural Networks Based on Cognitive Mechanism

  • Lin Wang
  • Minghu Jiang
  • Shasha Liao
  • Beixing Deng
  • Chengqing Zong
  • Yinghua Lu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4113)

Abstract

Feature selection is an important part of automatic text classification. In this paper, we use HowNet to extract concept attributes and propose the CHI-MCOR method to build a feature set. This method selects not only high-frequency words but also words of middle or low frequency that are important for text classification. The combined method performs much better than any one of the individual weighting methods. We then use the Self-Organizing Map (SOM) to perform automatic text clustering. The experimental results show that if the sememes are extracted properly, we can not only reduce the feature dimension but also improve the classification precision. SOM can be used for large-scale text clustering, and the clustering results are good when concept features are selected.
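As a rough illustration of the CHI half of the feature-selection step, the sketch below scores terms with the standard chi-square statistic on a toy corpus. It is not the authors' CHI-MCOR method: the abstract does not specify the MCOR weighting or how the two scores are combined, so both are omitted here. The function names `chi_square_scores` and `top_terms` and the four-document corpus are hypothetical and introduced only for this example.

```python
from collections import Counter


def chi_square_scores(docs, labels):
    """Return {(term, category): chi-square score} for a labelled toy corpus.

    docs   -- list of token lists (hypothetical, already segmented documents)
    labels -- list of category names, one per document
    """
    n = len(docs)
    cat_count = Counter(labels)   # number of documents per category
    term_count = Counter()        # number of documents containing each term
    joint = Counter()             # documents of category c that contain term t
    for tokens, cat in zip(docs, labels):
        for term in set(tokens):
            term_count[term] += 1
            joint[(term, cat)] += 1

    scores = {}
    for term in term_count:
        for cat in cat_count:
            a = joint[(term, cat)]       # in category, contains term
            b = term_count[term] - a     # outside category, contains term
            c = cat_count[cat] - a       # in category, lacks term
            d = n - a - b - c            # outside category, lacks term
            denom = (a + c) * (b + d) * (a + b) * (c + d)
            scores[(term, cat)] = n * (a * d - c * b) ** 2 / denom if denom else 0.0
    return scores


def top_terms(docs, labels, k):
    """Rank terms by their maximum chi-square score over all categories."""
    best = {}
    for (term, _cat), score in chi_square_scores(docs, labels).items():
        best[term] = max(best.get(term, 0.0), score)
    # Ties are broken alphabetically so the ranking is deterministic.
    return sorted(best, key=lambda t: (-best[t], t))[:k]


docs = [["stock", "market", "rise"], ["stock", "bank", "report"],
        ["match", "goal", "report"], ["match", "team", "win"]]
labels = ["finance", "finance", "sport", "sport"]
print(top_terms(docs, labels, 2))
```

In this toy corpus, "stock" and "match" each occur in every document of one category and in none of the other, so they receive the highest score, while "report", which is spread evenly over both categories, scores zero.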

Keywords

Natural Language Processing; Text Classification; Chinese Word; Concept Attribute; Concept Feature

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Lin Wang (1)
  • Minghu Jiang (1, 2)
  • Shasha Liao (2)
  • Beixing Deng (3)
  • Chengqing Zong (4)
  • Yinghua Lu (1)
  1. School of Electronic Eng., Beijing Univ. of Post and Telecom, Beijing, China
  2. Lab of Computational Linguistics, School of Humanities and Social Sciences, Tsinghua University, Beijing, China
  3. Dept. of Electronic Eng., Tsinghua University, Beijing, China
  4. State Key Lab of Pattern Recognition, Institute of Automation, Chinese Academy of Science, Beijing, China
