Non-hierarchical Relation Extraction of Chinese Text Based on Scalable Corpus
In ontology construction from Chinese text, non-hierarchical relation extraction is harder than concept extraction, and its results are still unsatisfactory. In this paper, we propose a scalable corpus model that uses Tongyici Cilin and word2vec to calculate the similarity between terms and adds the qualifying candidate terms to the corpus. In this way, the scalable corpus is expanded while non-hierarchical relations are being extracted. In turn, the corpus expanded with the new terms further facilitates non-hierarchical relation extraction. We carry out experiments on Chinese texts in the computer domain, and the results show that extraction performance improves as the corpus expands.
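The corpus-expansion step described above can be illustrated with a minimal sketch. It is not the paper's implementation: it uses only an embedding-based cosine similarity (the Tongyici Cilin component is omitted), and the function names, the toy three-dimensional vectors, and the 0.8 threshold are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand_corpus(corpus, candidates, vectors, threshold=0.8):
    """Return a new term set: the corpus plus every candidate whose best
    similarity to an existing corpus term reaches the threshold.

    `threshold` is a hypothetical value, not one taken from the paper.
    """
    expanded = set(corpus)
    for cand in candidates:
        if cand in expanded or cand not in vectors:
            continue
        # Compare against the original corpus terms that have a vector.
        best = max(
            (cosine(vectors[cand], vectors[t]) for t in corpus if t in vectors),
            default=0.0,
        )
        if best >= threshold:
            expanded.add(cand)
    return expanded


# Toy vectors standing in for trained word2vec embeddings (assumed values).
vectors = {
    "computer": [0.9, 0.1, 0.0],
    "laptop": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}
corpus = {"computer"}
expanded = expand_corpus(corpus, ["laptop", "banana"], vectors)
```

With these toy vectors, "laptop" is close enough to "computer" to be added, while "banana" is not. In practice the vectors would come from a trained word2vec model, and a thesaurus-based score could be combined with the cosine score before thresholding.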
Keywords: Relation extraction, Scalable corpus, Chinese text
Hai Wan’s research was in part supported by the National Natural Science Foundation of China under grant 61573386, Natural Science Foundation of Guangdong Province under grant 2016A030313292, Guangdong Province Science and Technology Plan projects under grant 2016B030305007, and Sun Yat-sen University Young Teachers Cultivation Project under grant 16lgpy40.