Chinese medical question answer selection via hybrid models based on CNN and GRU

  • Yuteng Zhang
  • Wenpeng Lu (corresponding author)
  • Weihua Ou
  • Guoqiang Zhang
  • Xu Zhang
  • Jinyong Cheng
  • Weiyu Zhang


Question answer selection in the Chinese medical field is very challenging because it requires effective text representations that capture the complex semantic relationships between Chinese questions and answers. Recent deep learning approaches, e.g., CNN and RNN, have shown their potential to improve selection quality. However, these methods capture only a part, or one side, of the semantic relationships and ignore the other rich and sophisticated ones, which limits the performance improvement. In this paper, a series of neural network models is proposed to address the Chinese medical question answer selection problem. To model the complex relationships between questions and answers, we develop both single and hybrid models with CNN and GRU, combining the merits of different neural network architectures. This differs from existing works, which capture only partial relationships by using a single network structure. Extensive experimental results on the cMedQA dataset demonstrate that the proposed hybrid models, especially BiGRU-CNN, significantly outperform the state-of-the-art methods. The source code of our models is available on GitHub.
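To make the hybrid idea concrete, the following is a minimal sketch of one way a BiGRU-CNN encoder could be wired for answer selection. It is not the authors' implementation: the abstract does not give the layer order, dimensions, similarity measure, or training objective. The sketch assumes that a bidirectional GRU reads character embeddings, that a convolution with max-pooling runs over the GRU states, that the question and the answers share one encoder, and that candidates are ranked by cosine similarity. The names `embed`, `GRU`, `BiGRUCNNEncoder` and `select_answer` are hypothetical, and the weights are random and untrained, so the scores only illustrate the data flow.

```python
import math
import random


def matvec(W, v):
    """Multiply matrix W (a list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def embed(text, dim=8):
    """Hypothetical stand-in for trained character embeddings."""
    vectors = []
    for ch in text:
        rng = random.Random(ord(ch))  # same character -> same vector
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors


class GRU:
    """Single-direction GRU with update (z), reset (r) and candidate (h) weights."""

    def __init__(self, in_dim, hid_dim, rng):
        self.hid_dim = hid_dim
        self.W = {g: rand_matrix(hid_dim, in_dim, rng) for g in "zrh"}
        self.U = {g: rand_matrix(hid_dim, hid_dim, rng) for g in "zrh"}

    def _gate(self, name, x, h, act):
        pre = zip(matvec(self.W[name], x), matvec(self.U[name], h))
        return [act(a + b) for a, b in pre]

    def run(self, xs):
        h, states = [0.0] * self.hid_dim, []
        for x in xs:
            z = self._gate("z", x, h, sigmoid)
            r = self._gate("r", x, h, sigmoid)
            reset_h = [ri * hi for ri, hi in zip(r, h)]
            cand = self._gate("h", x, reset_h, math.tanh)
            # Interpolate between the previous state and the candidate state.
            h = [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]
            states.append(h)
        return states


class BiGRUCNNEncoder:
    """Assumed layout: BiGRU states -> 1-D convolution -> max-pooling over time."""

    def __init__(self, emb_dim=8, hid_dim=6, n_filters=5, width=2, seed=0):
        rng = random.Random(seed)
        self.fwd = GRU(emb_dim, hid_dim, rng)
        self.bwd = GRU(emb_dim, hid_dim, rng)
        self.width = width
        self.filters = rand_matrix(n_filters, width * 2 * hid_dim, rng)

    def encode(self, xs):
        """Encode a sequence of at least `width` embedding vectors."""
        forward = self.fwd.run(xs)
        backward = self.bwd.run(xs[::-1])[::-1]
        states = [f + b for f, b in zip(forward, backward)]  # concatenate directions
        n_windows = len(states) - self.width + 1
        windows = [sum(states[i:i + self.width], []) for i in range(n_windows)]
        maps = [[math.tanh(v) for v in matvec(self.filters, w)] for w in windows]
        return [max(column) for column in zip(*maps)]  # max over time, per filter


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_answer(encoder, question, candidates):
    """Return the index of the best-scoring candidate and all scores."""
    q = encoder.encode(embed(question))
    scores = [cosine(q, encoder.encode(embed(c))) for c in candidates]
    best = max(range(len(candidates)), key=scores.__getitem__)
    return best, scores


encoder = BiGRUCNNEncoder()
best, scores = select_answer(encoder, "头痛怎么办", ["多喝水注意休息", "建议去医院检查"])
print(best, scores)
```

In a trained system the embeddings, GRU weights and filters would be learned from question-answer pairs, for example with a ranking loss. That training step is outside this sketch.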


Question answer selection; Chinese medical field; Question answering system; Convolutional neural network; Gated recurrent unit



The research work is supported by the National Key R&D Program of China under Grant No. 2018YFC0831704, the National Natural Science Foundation of China under Grants No. 61502259 and No. 61762021, the Natural Science Foundation of Guizhou Province under Grant No. 2017[1130], the Key Subjects Construction of Guizhou Province under Grant No. ZDXK[2016]8, the Natural Science Foundation of Shandong Province under Grant No. ZR2017MF056, and the Taishan Scholar Program of Shandong Province in China (directed by Prof. Yinglong Wang).



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. School of Computer Science and Technology, QiLu University of Technology (Shandong Academy of Sciences), Jinan, China
  2. School of Big Data and Computer Science, Guizhou Normal University, Guiyang, China
  3. Centre for Audio, Acoustics and Vibration, University of Technology Sydney, Sydney, Australia
