Prediction of Lung Cancer from Electronic Health Records Using CNN Supported NLP

  • Chapter
Computational Intelligence for Clinical Diagnosis

Part of the book series: EAI/Springer Innovations in Communication and Computing (EAISICC)

Abstract

Lung cancer is among the most severe cancers worldwide and is common in both men and women. As medical and healthcare facilities have expanded, the volume of data contributed to electronic health records (EHRs) has grown for decades. This growth creates an opportunity to broaden analytics through epidemiological studies that adopt intelligent informatics approaches. Among these, artificial intelligence-supported natural language processing (NLP) is used to automatically extract data from the EHR, that is, text datasets relating to lung cancer. Extracting instances manually from large datasets is laborious and time-consuming. In this paper, we use a deep learning (DL) architecture with NLP to extract instances from the input datasets and predict lung cancer. This text mining model automates the identification of instances in the input datasets for cancer prediction. The model is tested in a different environment, with different sets of test datasets, to assess the efficacy and robustness of DL with NLP. The evaluation showed higher prediction accuracy than existing methods. The simulation indicates that the model is robust for lung cancer research.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Jabir, K., Thirumurthi Raja, A. (2023). Prediction of Lung Cancer from Electronic Health Records Using CNN Supported NLP. In: Joseph, F.J.J., Balas, V.E., Rajest, S.S., Regin, R. (eds) Computational Intelligence for Clinical Diagnosis. EAI/Springer Innovations in Communication and Computing. Springer, Cham. https://doi.org/10.1007/978-3-031-23683-9_40

  • DOI: https://doi.org/10.1007/978-3-031-23683-9_40

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-23682-2

  • Online ISBN: 978-3-031-23683-9

  • eBook Packages: Engineering, Engineering (R0)
