
A Semi-automated Entity Relation Extraction Mechanism with Weakly Supervised Learning for Chinese Medical Webpages

  • Zhao Liu
  • Jian Tong
  • Jinguang Gu (email author)
  • Kai Liu
  • Bo Hu
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10219)

Abstract

Medical entity relation extraction is of great significance for medical text data mining and for building medical knowledge graphs. However, the medical field requires a very high data accuracy rate, and current medical entity relation extraction systems struggle to achieve the required accuracy. A main technical difficulty lies in how to obtain high-precision medical data and automatically generate an annotated training sample set. In this paper, we propose an automatic medical entity relation extraction system based on weak supervision. First, we design a visual annotation tool that automatically generates crawl scripts, which crawl medical data from sites where an entity and its attributes are stored separately. Then, based on the structure of the acquired data, we propose a weakly supervised hypothesis to automatically generate positive training samples. Finally, we use a CNN model to extract medical entity relations. Experiments show that the method is feasible and accurate.
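
The abstract does not state the weakly supervised hypothesis in detail. As a minimal sketch, assume a rule in the style of distant supervision: a sentence that mentions both an entity and one of the attribute values crawled for that entity is treated as a positive sample for the corresponding relation. The function name `generate_positive_samples`, the `Sample` record, and the example data are all hypothetical, and English strings stand in for Chinese webpage text for readability.

```python
from typing import Dict, List, NamedTuple


class Sample(NamedTuple):
    """One automatically labelled training example (hypothetical layout)."""
    sentence: str
    entity: str
    value: str
    relation: str


def generate_positive_samples(
    entity: str,
    attributes: Dict[str, List[str]],
    sentences: List[str],
) -> List[Sample]:
    """Label a sentence as positive for a relation when it contains both the
    entity and a crawled attribute value for that relation.

    This matching rule is an assumption made for illustration; it is not
    taken from the paper.
    """
    samples = []
    for sentence in sentences:
        if entity not in sentence:
            continue  # the entity must be mentioned in the sentence
        for relation, values in attributes.items():
            for value in values:
                if value in sentence:
                    samples.append(Sample(sentence, entity, value, relation))
    return samples


# Hypothetical crawled record: an entity whose attributes are stored separately.
entity = "influenza"
attributes = {"symptom": ["cough", "fever"], "drug": ["oseltamivir"]}
sentences = [
    "Common symptoms of influenza include cough and fever.",
    "Patients with influenza are often given oseltamivir.",
    "Drinking water helps recovery.",
]

samples = generate_positive_samples(entity, attributes, sentences)
for s in samples:
    print(s.relation, "|", s.entity, "|", s.value)
```

Samples labelled this way could then be passed to a sentence classifier such as the CNN model mentioned in the abstract. Plain substring matching is the simplest possible choice here; a real system for Chinese text would likely need word segmentation and entity normalisation first.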

Keywords

Medical entity; Entity relation extraction; Convolutional neural network; Weakly supervised learning

Notes

Acknowledgement

This work is supported by the National Natural Science Foundation of China (61272110, 61602350), the Key Projects of National Social Science Foundation of China (11&ZD189), the State Key Lab of Software Engineering Open Foundation of Wuhan University (SKLSE2012-09-07), and the NSF of Wuhan University of Science and Technology of China under grant number 2016xz016.

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Zhao Liu (1, 2)
  • Jian Tong (1, 2)
  • Jinguang Gu (1, 2), email author
  • Kai Liu (1, 2)
  • Bo Hu (3)
  1. College of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan, China
  2. Hubei Province Key Laboratory of Intelligent Information Processing and Real-Time Industrial System, Wuhan, China
  3. Kingdee Cloud Platform Department, Kingdee International Software Group Co., Ltd., Shenzhen, China
