Abstract
Lung cancer is among the most severe cancers worldwide and is common in both men and women. As medical and healthcare facilities have expanded, the volume of electronic health records (EHRs) has grown for decades. This growth creates an opportunity to strengthen analytics through epidemiological studies that adopt intelligent informatics approaches. Among these, artificial intelligence-supported natural language processing (NLP) is used to automatically extract data from EHRs, that is, from text datasets related to lung cancer. Extracting instances manually from large datasets is laborious and time-consuming. In this paper, we use a deep learning (DL) architecture with NLP to extract instances from the input datasets and predict lung cancer from them. This text mining model automates the prediction of instances from the input datasets. The model is tested in a different environment, with different sets of test datasets, to assess the efficacy and robustness of DL with NLP. The evaluation showed higher prediction accuracy than existing methods, and the simulation indicates that the model is robust for lung cancer research.
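The abstract does not specify the architecture, so the following is only a minimal sketch of how a convolutional text classifier of the kind named in the chapter title could score a clinical note. It assumes a one-layer design (embedding, 1D convolution, ReLU, max-over-time pooling, sigmoid output); the class `TinyTextCNN`, the helpers `tokenize`, `build_vocab`, and `encode`, all hyperparameters, and the two sample notes are hypothetical and are not taken from the chapter. The weights are random and untrained, so the printed probability carries no clinical meaning.

```python
import math
import random
import re


def tokenize(text):
    """Lowercase a note and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_vocab(notes):
    """Map each distinct token to an integer id; id 0 is reserved for unknown words."""
    vocab = {"<unk>": 0}
    for note in notes:
        for tok in tokenize(note):
            vocab.setdefault(tok, len(vocab))
    return vocab


def encode(text, vocab):
    """Turn a note into a list of token ids, using 0 for out-of-vocabulary tokens."""
    return [vocab.get(tok, 0) for tok in tokenize(text)]


class TinyTextCNN:
    """Forward pass only: embed -> conv1d -> ReLU -> max-over-time pool -> sigmoid."""

    def __init__(self, vocab_size, embed_dim=8, num_filters=4, kernel_size=3, seed=0):
        rng = random.Random(seed)  # fixed seed so the sketch is reproducible
        self.kernel_size = kernel_size
        self.embed = [[rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
                      for _ in range(vocab_size)]
        # filters[f][k][d]: weight of filter f at window offset k, embedding dim d
        self.filters = [[[rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
                         for _ in range(kernel_size)]
                        for _ in range(num_filters)]
        self.out_weights = [rng.uniform(-0.5, 0.5) for _ in range(num_filters)]
        self.out_bias = 0.0

    def predict_proba(self, token_ids):
        # Pad short notes with the unknown id so at least one window exists.
        ids = list(token_ids) + [0] * max(0, self.kernel_size - len(token_ids))
        vectors = [self.embed[i] for i in ids]
        pooled = []
        for filt in self.filters:
            best = 0.0  # starting at 0 makes the max equal to max-pooling after ReLU
            for start in range(len(vectors) - self.kernel_size + 1):
                act = sum(w * x
                          for k in range(self.kernel_size)
                          for w, x in zip(filt[k], vectors[start + k]))
                best = max(best, act)
            pooled.append(best)
        logit = self.out_bias + sum(w * p for w, p in zip(self.out_weights, pooled))
        return 1.0 / (1.0 + math.exp(-logit))


# Made-up example notes, for illustration only.
notes = [
    "CT shows a spiculated mass in the right upper lobe.",
    "No acute cardiopulmonary abnormality.",
]
vocab = build_vocab(notes)
model = TinyTextCNN(vocab_size=len(vocab))
print(model.predict_proba(encode(notes[0], vocab)))  # untrained score in (0, 1)
```

In practice the embedding, filter, and output weights would be learned from labeled EHR text with a deep learning framework; the pure-Python forward pass is used here only to make each step of the pipeline explicit.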
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Jabir, K., Thirumurthi Raja, A. (2023). Prediction of Lung Cancer from Electronic Health Records Using CNN Supported NLP. In: Joseph, F.J.J., Balas, V.E., Rajest, S.S., Regin, R. (eds) Computational Intelligence for Clinical Diagnosis. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-23683-9_40
DOI: https://doi.org/10.1007/978-3-031-23683-9_40
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-23682-2
Online ISBN: 978-3-031-23683-9
eBook Packages: Engineering, Engineering (R0)