A Semantic Matching of Information Segments for Tolerating Error Chinese Words

  • Maoyuan Zhang
  • Chunyan Zou
  • Zhengding Lu
  • Zhigang Wang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4255)


Chinese web pages contain both new words and erroneous words. In this paper, we introduce our definition of semantic similarity between sememes, together with the associated theorems. After proving the theorems, we analyze the influence of the parameter. We then present a novel definition of word similarity based on sememe similarity, which can be used to match new Chinese words with existing Chinese words and to match erroneous Chinese words with correct Chinese words. Based on this word similarity, we present a method for matching information segments that recognizes the category of Chinese web information segments in which new words and erroneous words occur. Experiments with the matching method are also presented. Both the theoretical analysis and the experimental results indicate that the novel matching method is efficient.


Semantic Similarity, Semantic Relation, Match Method, Semantic Network, Information Item
These keywords were added by machine and not by the authors.




Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Maoyuan Zhang (1, 3)
  • Chunyan Zou (2)
  • Zhengding Lu (1)
  • Zhigang Wang (1)
  1. Department of Computer Science and Technology, HuaZhong University of Science and Technology, Wuhan, P.R. China
  2. School of Foreign Languages, HuaZhong Normal University, Wuhan, P.R. China
  3. School of Management, HuaZhong University of Science and Technology, Wuhan, P.R. China
