An Approach to Acquire Semantic Relationships Between Words from Web Document
In this paper, we focus on acquiring semantic relationships from Chinese web documents, motivated by the strong demand for web question answering systems in e-Learning. In the pre-processing phase, our scheme reduces the amount of text to be analyzed and produces initial sentence-level text. Linguistic rules designed for Chinese phrases, divided into unambiguous and ambiguous rules, are then applied to this sentence-level text to extract synonymy, hyponymy, hypernymy, and parataxis relationships. Finally, the candidates are refined using two heuristics. Unlike previous work, we apply not only strict unambiguous linguistic rules but also loose ambiguous ones to extract relationships, and we propose an efficient approach to refine the outputs of these rules. Experiments show that this method can acquire semantic relationships efficiently and effectively.
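The pipeline described above (sentence-level pre-processing, rule application, candidate refinement) can be illustrated with a minimal sketch. The abstract does not state the actual rules or the two heuristics, so everything concrete here is an assumption made for illustration: the three patterns, the function names, and the frequency threshold that stands in for the refinement step.

```python
import re
from collections import Counter

# Hypothetical rules, not taken from the paper: (relation, pattern, is_ambiguous).
# The first two cue phrases are treated as unambiguous; the conjunction "和"
# is treated as ambiguous because it also joins words that are not coordinate terms.
RULES = [
    ("synonymy", re.compile(r"(\w+?)又称(\w+)"), False),
    ("hyponymy", re.compile(r"(\w+?)是一种(\w+)"), False),
    ("parataxis", re.compile(r"(\w+?)和(\w+)"), True),
]


def split_sentences(text):
    """Pre-processing stand-in: cut raw text into sentence-level units."""
    return [s for s in re.split(r"[。！？]", text) if s]


def extract_candidates(sentences):
    """Apply every rule to every sentence and collect candidate relations."""
    candidates = []
    for sentence in sentences:
        for relation, pattern, ambiguous in RULES:
            for match in pattern.finditer(sentence):
                candidates.append((relation, match.group(1), match.group(2), ambiguous))
    return candidates


def refine(candidates, min_count=2):
    """Hypothetical refinement: keep unambiguous candidates as they are and keep
    ambiguous ones only if they were seen at least min_count times."""
    counts = Counter((rel, a, b) for rel, a, b, _ in candidates)
    kept = set()
    for rel, a, b, ambiguous in candidates:
        if not ambiguous or counts[(rel, a, b)] >= min_count:
            kept.add((rel, a, b))
    return kept


text = "计算机又称电脑。麻雀是一种鸟。苹果和香蕉。苹果和香蕉。我和他。"
relations = refine(extract_candidates(split_sentences(text)))
for relation in sorted(relations):
    print(relation)
```

In this sketch the pair from "我和他" is discarded because its ambiguous rule fired only once, while the repeated pair from "苹果和香蕉" survives. The frequency threshold is only a placeholder for the two heuristics the paper proposes.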