Elimination Method Study of Ambiguous Words in Chinese Automatic Indexing

  • Wang Dan
  • Yang Xiaorong
  • Zhang Jie
Conference paper
Part of the IFIP Advances in Information and Communication Technology book series (IFIPAICT, volume 420)

Abstract

Accurate retrieval from the huge amounts of information available in a network environment first requires that index terms be free of ambiguous words. The basic unit of Chinese is the character; characters combine into words, which are either monosyllabic or compound; and Chinese text has no spaces between words, so many character strings admit more than one reading. As a result, the indexing process produces many ambiguities, and retrieval returns irrelevant or misidentified information. This paper focuses on eliminating crossing (overlapping) ambiguous words in automatic indexing. It proposes an elimination algorithm that combines an exhaustive method with disambiguation rules. Experiments show that the method avoids many segmentation ambiguities and yields better segmentation results.
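The abstract does not spell out the algorithm or its rules, so the following is only a minimal sketch of the general idea of pairing exhaustive enumeration with a disambiguation rule. The function names, the toy lexicon, and the rule itself (prefer fewer words, then fewer single-character words) are assumptions made for illustration and are not taken from the paper.

```python
from typing import List, Set

# Hypothetical toy lexicon; a real indexer would use a full dictionary.
LEXICON: Set[str] = {"研究", "研究生", "生命"}


def all_segmentations(text: str, lexicon: Set[str]) -> List[List[str]]:
    """Exhaustively enumerate every split of text into lexicon words.

    Single characters are always accepted as words so that any
    string can be segmented.
    """
    if not text:
        return [[]]
    results: List[List[str]] = []
    for end in range(1, len(text) + 1):
        head = text[:end]
        if end == 1 or head in lexicon:
            for rest in all_segmentations(text[end:], lexicon):
                results.append([head] + rest)
    return results


def pick_segmentation(text: str, lexicon: Set[str]) -> List[str]:
    """Apply an illustrative rule: fewest words, then fewest single characters."""
    candidates = all_segmentations(text, lexicon)
    return min(
        candidates,
        key=lambda seg: (len(seg), sum(1 for word in seg if len(word) == 1)),
    )


if __name__ == "__main__":
    # "研究生命" shows a crossing ambiguity: "研究生" and "生命" overlap on "生".
    for seg in all_segmentations("研究生命", LEXICON):
        print("/".join(seg))
    print("chosen:", "/".join(pick_segmentation("研究生命", LEXICON)))
```

In this toy case the two shortest candidates, 研究/生命 and 研究生/命, both have two words, and the tie is broken in favor of the one with no leftover single character. Exhaustive enumeration grows exponentially with string length, so a practical system would restrict it to the short spans where candidate words overlap.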

Keywords

Chinese text; Automatic indexing; Keyword extraction; Ambiguous words; Elimination algorithm

Copyright information

© IFIP International Federation for Information Processing 2014

Authors and Affiliations

  • Wang Dan (1, 2)
  • Yang Xiaorong (1, 2)
  • Zhang Jie (1, 2)
  1. Institute of Agricultural Information, Chinese Academy of Agricultural Sciences, Beijing, China
  2. Key Laboratory of Agricultural Information Service Technology (2006-2010), Ministry of Agriculture, The People's Republic of China