A Semi-automated Entity Relation Extraction Mechanism with Weakly Supervised Learning for Chinese Medical Webpages
Medical entity relation extraction is of great significance for medical text mining and medical knowledge graphs. However, the medical field demands very high data accuracy, which current medical entity relation extraction systems struggle to achieve. A main technical difficulty lies in obtaining high-precision medical data and automatically generating an annotated training sample set. In this paper, we propose a weakly supervised system for automatic medical entity relation extraction. First, we design a visual annotation tool that automatically generates crawl scripts, which crawl medical data from sites where an entity and its attributes are stored separately. Then, based on the structure of the acquired data, we propose a weak-supervision hypothesis to automatically generate positive training samples. Finally, we use a CNN model to extract medical entity relations. Experiments show that the method is feasible and accurate.
Keywords: Medical entity, Entity relation extraction, Convolutional neural network, Weakly supervised learning
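To make the positive-sample generation step concrete, here is a minimal Python sketch of one possible weak-supervision heuristic. The abstract does not spell out the hypothesis, so the rule used here is an assumption in the style of distant supervision: a sentence that mentions both a crawled entity and one of its crawled attribute values is labelled as a positive example of that relation. The names `Triple`, `Sample`, `split_sentences`, and `label_positive_samples`, as well as the example data, are hypothetical and not taken from the paper.

```python
import re
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One crawled fact: an entity, a relation (page field), and the field value."""
    entity: str
    relation: str
    value: str


class Sample(NamedTuple):
    """A sentence weakly labelled as expressing `relation` between entity and value."""
    sentence: str
    entity: str
    value: str
    relation: str


def split_sentences(text: str) -> List[str]:
    # Split on common Chinese and Western sentence terminators and newlines.
    parts = re.split(r"[。！？!?；;\n]+", text)
    return [p.strip() for p in parts if p.strip()]


def label_positive_samples(triples: List[Triple], text: str) -> List[Sample]:
    # Assumed hypothesis (not from the paper): if a sentence contains both the
    # entity and the attribute value of a crawled triple, treat the sentence
    # as a positive training sample for that triple's relation.
    samples = []
    for sentence in split_sentences(text):
        for t in triples:
            if t.entity in sentence and t.value in sentence:
                samples.append(Sample(sentence, t.entity, t.value, t.relation))
    return samples


# Hypothetical example data: 感冒 = common cold, 咳嗽 = cough, 板蓝根 = isatis root.
triples = [
    Triple("感冒", "symptom", "咳嗽"),
    Triple("感冒", "drug", "板蓝根"),
]
text = "感冒常见咳嗽和发热。板蓝根可用于感冒。多喝水。"
samples = label_positive_samples(triples, text)
for s in samples:
    print(s.relation, s.sentence)
```

Simple substring matching keeps the sketch short. A real system would likely need word segmentation and entity normalization, and the labelled sentences would then be fed to a sentence-level classifier such as the CNN mentioned in the abstract.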
This work is supported by the National Natural Science Foundation of China (61272110, 61602350), the Key Projects of National Social Science Foundation of China (11&ZD189), the State Key Lab of Software Engineering Open Foundation of Wuhan University (SKLSE2012-09-07), and the NSF of Wuhan University of Science and Technology of China under grant number 2016xz016.