Computation of Word Similarity Based on the Information Content of Sememes and PageRank Algorithm

  • Hao Li
  • Lingling Mu
  • Hongying Zan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10085)

Abstract

This article proposes a method for computing word similarity based on the sememe structure of HowNet and the PageRank algorithm. The method uses the depth information in HowNet as the information content of sememes and takes sememe hyponymy into account to build a transfer matrix; it then computes sememe vectors with the PageRank algorithm to obtain sememe similarity. Word similarity is in turn calculated from sememe similarity. The method is tested on several groups of typical Chinese words and on the word sense classification of nouns in the Contemporary Chinese Semantic Dictionary (CSD). The results show that the word similarity computed in this way is largely consistent with the facts. The method also gives more accurate results in the word sense classification of nouns in the CSD, reaching 71.9% consistency with human judgment.
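
The abstract does not give the formulas, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a hypothetical toy hyponymy tree (not HowNet data), an information content that grows linearly with depth, transition weights proportional to the information content of the neighbouring sememe, a personalized PageRank vector per sememe, and cosine similarity between those vectors. All names and parameter values are illustrative.

```python
import math

# Hypothetical toy hyponymy tree (child -> parent); not taken from HowNet.
PARENT = {
    "animal": "entity",
    "plant": "entity",
    "dog": "animal",
    "cat": "animal",
    "tree": "plant",
}
NODES = sorted(set(PARENT) | set(PARENT.values()))


def depth(node):
    """Depth in the tree, with the root at depth 1."""
    d = 1
    while node in PARENT:
        node = PARENT[node]
        d += 1
    return d


MAX_DEPTH = max(depth(n) for n in NODES)
# Assumption: information content grows linearly with depth.
IC = {n: depth(n) / MAX_DEPTH for n in NODES}


def neighbours(node):
    """Hyponyms and hypernym of a node (hyponymy links, both directions)."""
    out = [c for c, p in PARENT.items() if p == node]
    if node in PARENT:
        out.append(PARENT[node])
    return out


def transfer_row(node):
    """Transition probabilities out of `node`, weighted by neighbour IC."""
    nbrs = neighbours(node)
    total = sum(IC[n] for n in nbrs)
    return {n: IC[n] / total for n in nbrs}


def sememe_vector(seed, damping=0.85, iterations=100):
    """Personalized PageRank vector with all restart mass on `seed`."""
    rank = {n: 1.0 if n == seed else 0.0 for n in NODES}
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) if n == seed else 0.0 for n in NODES}
        for src in NODES:
            for dst, p in transfer_row(src).items():
                nxt[dst] += damping * rank[src] * p
        rank = nxt
    return [rank[n] for n in NODES]


def similarity(a, b):
    """Cosine similarity between the PageRank vectors of two sememes."""
    va, vb = sememe_vector(a), sememe_vector(b)
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm
```

In this toy tree, `similarity("dog", "cat")` comes out higher than `similarity("dog", "tree")`, because the first pair shares the hypernym `animal` while the second pair is connected only through the root. The step from sememe similarity to word similarity is not shown, since the abstract does not describe how the two are combined.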

Keywords

Word similarity; HowNet; Sememe; PageRank; Word sense classification

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. School of Information Engineering, Zhengzhou University, Zhengzhou, China
