A Chinese Question Answering System in Medical Domain
Question answering systems offer a friendly interface through which people can interact with massive amounts of online information. Retrieving useful medical information from the many available websites with a search engine is time-consuming for users. This work builds a Chinese Question Answering System in Medical Domain (CQASMD) to provide users with useful medical information. A large medical knowledge base containing more than 300 thousand medical terms and their descriptions is first constructed to store structured medical knowledge, and its contents are classified with the FastText model. A Word2Vec model is then adopted to capture the semantic meanings of words, and the questions and answers are processed with sentence embedding to capture semantic context information. A user's question is first classified and converted into a sentence vector, and a matching algorithm then finds the most similar previously stored question. The constructed medical knowledge base is queried, and the answer corresponding to the matched question is returned to the user. The architecture and flowchart of CQASMD are presented; the system is expected to play an important role in disease self-diagnosis and treatment.
Key words: question answering, knowledge base, FastText, sentence embedding, disease diagnosis
CLC number: TP 391
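The matching step described in the abstract (turn a question into a sentence vector, find the most similar stored question, return its answer) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy three-dimensional word vectors stand in for a trained Word2Vec model, the sentence embedding is taken to be a plain average of word vectors, cosine similarity is used as the matching score, and the names `WORD_VECTORS`, `KNOWLEDGE_BASE`, `sentence_vector`, `cosine` and `answer` are hypothetical. The abstract does not specify these details, and the FastText classification step is omitted.

```python
import math

# Hypothetical toy word vectors standing in for a trained Word2Vec model.
WORD_VECTORS = {
    "headache": [0.9, 0.1, 0.0],
    "fever": [0.1, 0.9, 0.0],
    "treatment": [0.0, 0.2, 0.9],
    "cause": [0.2, 0.0, 0.8],
}

# Hypothetical stored (tokenized question, answer) pairs.
KNOWLEDGE_BASE = [
    (["headache", "treatment"], "Answer about treating headaches."),
    (["fever", "cause"], "Answer about causes of fever."),
]


def sentence_vector(tokens):
    """Average the vectors of known tokens (an assumed embedding scheme).

    Returns None when no token has a vector.
    """
    vectors = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def answer(tokens):
    """Return the answer of the stored question most similar to the query."""
    query = sentence_vector(tokens)
    if query is None:
        return None
    best_answer, best_score = None, -1.0
    for stored_tokens, stored_answer in KNOWLEDGE_BASE:
        stored = sentence_vector(stored_tokens)
        if stored is None:
            continue
        score = cosine(query, stored)
        if score > best_score:
            best_answer, best_score = stored_answer, score
    return best_answer


if __name__ == "__main__":
    print(answer(["fever"]))
```

In a real system of this kind the questions would first be segmented into Chinese words, the stored sentence vectors would be precomputed, and the search would be restricted to the category predicted by the classifier; none of that is shown here.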