Using Deep Learning with Canadian Primary Care Data for Disease Diagnosis

Zafari, Hasan; Kosowan, Leanne; Lam, Jason T.; Peeler, William; Gasmallah, Mohammad; Zulkernine, Farhana; Singer, Alexander

doi:10.1007/978-3-030-71676-9_12

Abstract

The majority of Canadian primary care systems record patient data in the form of Electronic Medical Records (EMRs). EMRs hold structured, semi-structured and unstructured demographic and health care data about patients. The value of EMR data for research, health surveillance and quality improvement continues to be explored. Data analytics techniques such as Machine Learning (ML) and statistical modeling have been applied to de-identified EMR data repositories to advance our understanding of different health conditions and patient care. More recently, the application of Deep Learning (DL) approaches to the structured, semi-structured and unstructured data in EMRs has been investigated as an avenue for improved identification of health conditions. Supervised ML methods have dominated classification of the more prevalent diseases, but they require a large cohort of labeled data for training. For less common diseases, the amount of available labeled data is often insufficient, and a variety of strategies are being explored to deal with inadequate, noisy and missing data. This chapter describes the benefits of using DL models with EMR data in research aimed at improving the provision of health care in primary care settings. A few prominent DL models, such as the Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN), are discussed with example scenarios that demonstrate the application of some of these predictive analytics models to both structured and unstructured EMR data, using regular and weak supervision methods to diagnose both prevalent and non-prevalent diseases.

Author information

Authors and Affiliations

School of Computing, Queen’s University, Kingston, ON, Canada
Hasan Zafari, Jason T. Lam, Mohammad Gasmallah & Farhana Zulkernine
Department of Family Medicine, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, MB, Canada
Leanne Kosowan, William Peeler & Alexander Singer

Corresponding author

Correspondence to Farhana Zulkernine.

Editor information

Editors and Affiliations

Computing and Information Technology, The University of Bisha, Bisha, Saudi Arabia
Mourad Elloumi

About this chapter

Cite this chapter

Zafari, H. et al. (2021). Using Deep Learning with Canadian Primary Care Data for Disease Diagnosis. In: Elloumi, M. (ed.) Deep Learning for Biomedical Data Analysis. Springer, Cham. https://doi.org/10.1007/978-3-030-71676-9_12

DOI: https://doi.org/10.1007/978-3-030-71676-9_12
Published: 14 July 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-71675-2
Online ISBN: 978-3-030-71676-9
eBook Packages: Biomedical and Life Sciences, Biomedical and Life Sciences (R0)
