Towards Assigning Diagnosis Codes Using Medication History

Sagi, Tomer; Hansen, Emil Riis; Hose, Katja; Lip, Gregory Y. H.; Bjerregaard Larsen, Torben; Skjøth, Flemming

doi:10.1007/978-3-030-59137-3_19

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12299)

Included in the following conference series:

International Conference on Artificial Intelligence in Medicine

Abstract

Prior studies that manually assessed diagnosis codes found them to be erroneous or incomplete in 4–30% of cases. Existing methods that validate codes and suggest missing ones from medical notes are of limited use when notes are unavailable or not written in English. In this work, we propose using patients’ medication data to suggest and validate diagnosis codes. Previous attempts to assign codes using medication data have focused on a single condition. We present a proof-of-concept study using MIMIC-III prescription data to train a machine-learning-based model to predict a large collection of diagnosis codes assigned on four levels of aggregation of the ICD-9 hierarchy. The model correctly recalls 58.2% of the ICD-9 categories and is precise in 78.3% of the cases. We evaluate the model’s performance on more detailed ICD-9 levels and examine which codes and code groups can be accurately assigned using medication data. We also propose a specialized loss function designed to exploit ICD-9’s hierarchical structure; it performs consistently better than the non-hierarchical state of the art.
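To make the recall and precision figures above concrete, the following is a minimal sketch of how multi-label predictions might be scored at two levels of the ICD-9 hierarchy. It is not the authors' evaluation code: the function names `to_category` and `micro_precision_recall` are hypothetical, micro-averaging is an assumption, and codes are assumed to be written without a decimal point, as in MIMIC-III.

```python
def to_category(code: str) -> str:
    """Truncate an ICD-9 code to its category (3 characters, 4 for E codes)."""
    code = code.replace(".", "").upper()
    return code[:4] if code.startswith("E") else code[:3]


def micro_precision_recall(true_sets, pred_sets):
    """Micro-averaged precision and recall over per-admission code sets."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # codes both assigned and predicted
        fp += len(pred - truth)   # predicted but not assigned
        fn += len(truth - pred)   # assigned but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented for illustration): two admissions.
truth = [{"4019", "4280"}, {"25000"}]
pred = [{"4011"}, {"25000", "5849"}]

# Score at the full-code level, then again after rolling up to categories.
full_scores = micro_precision_recall(truth, pred)
cat_truth = [{to_category(c) for c in s} for s in truth]
cat_pred = [{to_category(c) for c in s} for s in pred]
category_scores = micro_precision_recall(cat_truth, cat_pred)
```

On this toy data the category-level scores are higher than the full-code scores, because a prediction of 4011 counts as a hit for 4019 once both are rolled up to category 401.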

References

Baumel, T., et al.: Multi-label classification of patient notes: case study on ICD code assignment. In: Workshops at the Thirty-Second AAAI Conference on Artificial Intelligence, June 2018. https://www.aaai.org/ocs/index.php/WS/AAAIW18/paper/viewPaper/16881
Cerri, R., Barros, R.C., De Carvalho, A.C.: Hierarchical multi-label classification using local neural networks. J. Comput. Syst. Sci. 80(1), 39–56 (2014). https://doi.org/10.1016/j.jcss.2013.03.007
Cheng, X., Zhang, L., Zheng, Y.: Deep similarity learning for multimodal medical images. Comput. Meth. Biomech. Biomed. Eng. Imaging Vis. 6(3), 248–252 (2018)
Cooke, C.R., et al.: The validity of using ICD-9 codes and pharmacy records to identify patients with chronic obstructive pulmonary disease. BMC Health Serv. Res. 11(1), 37 (2011)
Dalsgaard, E.M., Witte, D.R., Charles, M., Jørgensen, M.E., Lauritzen, T., Sandbæk, A.: Validity of Danish register diagnoses of myocardial infarction and stroke against experts in people with screen-detected diabetes. BMC Public Health 19(1), 228 (2019). https://doi.org/10.1186/s12889-019-6549-z
Davie, G., Langley, J., Samaranayaka, A., Wetherspoon, M.E.: Accuracy of injury coding under ICD-10-AM for New Zealand public hospital discharges. Inj. Prev. 14(5), 319–323 (2008). https://doi.org/10.1136/ip.2007.017954
Fabris, F., Freitas, A.A., Tullet, J.M.: An extensive empirical comparison of probabilistic hierarchical classifiers in datasets of ageing-related genes. IEEE/ACM Trans. Comput. Biol. Bioinform. 13(6), 1045–1058 (2016). https://doi.org/10.1109/TCBB.2015.2505288
Ford, E., Carroll, J.A., Smith, H.E., Scott, D., Cassell, J.A.: Extracting information from the text of electronic medical records to improve case detection: a systematic review. J. Am. Med. Inf. Assoc. 23(5), 1007–1015 (2016). https://doi.org/10.1093/jamia/ocv180
Goldberger, A.L., et al.: Physiobank, physiotoolkit, and physionet: components of a new research resource for complex physiologic signals. Circulation 101(23), e215–e220 (2000)
Hansen, E.R., Sagi, T., Hose, K., Lip, G.Y.H., Larsen, T.B., Skjøth, F.: MIMIC Prescriptions result files (2020). https://doi.org/10.7910/DVN/5VTBME
Hripcsak, G., et al.: Observational health data sciences and informatics (OHDSI): opportunities for observational researchers. Stud. Health Technol. Inf. 216, 574–8 (2015)
Huang, J., Osorio, C., Sy, L.W.: An empirical evaluation of deep learning for ICD-9 code assignment using MIMIC-III clinical notes. Comput. Methods Programs Biomed. 177, 141–153 (2019). https://doi.org/10.1016/j.cmpb.2019.05.024
Hung, C.Y., Chen, W.C., Lai, P.T., Lin, C.H., Lee, C.C.: Comparing deep neural network and other machine learning algorithms for stroke prediction in a large-scale population-based electronic medical claims database. In: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pp. 3110–3113. Institute of Electrical and Electronics Engineers Inc. September 2017. https://doi.org/10.1109/EMBC.2017.8037515
Johnson, A.E., et al.: MIMIC-III, a freely accessible critical care database. Sci. Data 3(1), 1–9 (2016)
Kim, L., Kim, J.A., Kim, S.: A guide for the utilization of health insurance review and assessment service national patient samples. Epidemiol. Health 36, e2014008 (2014). https://doi.org/10.4178/epih/e2014008
Martins, A.F.T., Astudillo, R.F.: From softmax to sparsemax: a sparse model of attention and multi-label classification. In: Balcan, M., Weinberger, K.Q. (eds.) Proceedings of the 33nd International Conference on Machine Learning, ICML 2016, New York City, USA, 19–24 June 2016. JMLR Workshop and Conference Proceedings, vol. 48, pp. 1614–1623. JMLR.org (2016). http://proceedings.mlr.press/v48/martins16.html
Névéol, A., Dalianis, H., Velupillai, S., Savova, G., Zweigenbaum, P.: Clinical natural language processing in languages other than English: opportunities and challenges. J. Biomed. Semant 9(1), 12 (2018). https://doi.org/10.1186/s13326-018-0179-8
Perotte, A., et al.: Diagnosis code assignment: models and evaluation metrics. J. Am. Med. Inf. Assoc. 21(2), 231–237 (2014). https://doi.org/10.1136/amiajnl-2013-002159
Razavian, N., Marcus, J., Sontag, D.A.: Multi-task prediction of disease onsets from longitudinal laboratory tests. In: Doshi-Velez, F., Fackler, J., Kale, D.C., Wallace, B.C., Wiens, J. (eds.) Proceedings of the 1st Machine Learning in Health Care, MLHC 2016, Los Angeles, CA, USA, 19–20 August 2016. JMLR Workshop and Conference Proceedings, vol. 56, pp. 73–100. JMLR.org (2016). http://proceedings.mlr.press/v56/Razavian16.html
Schmidt, M., et al.: The Danish health care system and epidemiological research: from health care contacts to database records. Clin. Epidemiol. 11, 563–591 (2019). https://doi.org/10.2147/CLEP.S179083
Schmidt, M., Sørensen, H.T., Pedersen, L.: Diclofenac use and cardiovascular risks: series of nationwide cohort studies. BMJ 362, k3426 (2018). https://doi.org/10.1136/bmj.k3426
Schmidt, S.A., Vestergaard, M., Baggesen, L.M., Pedersen, L., Schønheyder, H.C., Sørensen, H.T.: Prevaccination epidemiology of herpes zoster in Denmark: quantification of occurrence and risk factors. Vaccine 35(42), 5589–5596 (2017). https://doi.org/10.1016/j.vaccine.2017.08.065
Wang, Y., et al.: Clinical information extraction applications: a literature review. J. Biomed. Inf. 77, 34–49 (2018). https://doi.org/10.1016/j.jbi.2017.11.011
Wehrmann, J., Cerri, R., Barros, R.: Hierarchical multi-label classification networks. In: Dy, J., Krause, A. (eds.) Proceedings of the 35th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 80, pp. 5075–5084. PMLR, Stockholmsmässan, Stockholm Sweden (2018). http://proceedings.mlr.press/v80/wehrmann18a.html
WHO: International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10). Technical report, World Health Organization, Geneva, Switzerland (2004)
Wockenfuss, R., Frese, T., Herrmann, K., Claussnitzer, M., Sandholzer, H.: Three- and four-digit ICD-10 is not a reliable classification system in primary care. Scand. J. Prim. Health Care 27(3), 131–136 (2009). https://doi.org/10.1080/02813430903072215
Xu, D., Shi, Y., Tsang, I.W., Ong, Y., Gong, C., Shen, X.: Survey on multi-output learning. IEEE Trans. Neural Netw. Learn. Syst. (Early Access), 1–21 (2019)
Zhang, M.L., Zhou, Z.H.: A review on multi-label learning algorithms. IEEE Trans. Knowl. Data Eng. 26(8), 1819–1837 (2014). https://doi.org/10.1109/TKDE.2013.39

Author information

Authors and Affiliations

Department of Computer Science, Aalborg University, Aalborg, Denmark
Tomer Sagi, Emil Riis Hansen & Katja Hose
Aalborg Thrombosis Research Unit, Department of Clinical Medicine, Aalborg University, Aalborg, Denmark
Gregory Y. H. Lip, Torben Bjerregaard Larsen & Flemming Skjøth
Thrombosis and Drug Research Unit, Department of Research and Innovation, Aalborg University Hospital, Aalborg, Denmark
Flemming Skjøth
Thrombosis and Drug Research Unit, Department of Cardiology, Aalborg University Hospital, Aalborg, Denmark
Torben Bjerregaard Larsen
Liverpool Centre for Cardiovascular Sciences, University of Liverpool and Liverpool Heart & Chest Hospital, Liverpool, UK
Gregory Y. H. Lip

Corresponding author

Correspondence to Tomer Sagi.

Editor information

Editors and Affiliations

School of Nursing, University of Minnesota, Minneapolis, MN, USA
Martin Michalowski
Ben-Gurion University of the Negev, Beer-Sheva, Israel
Robert Moskovitch

A Appendix - Omitted codes and detailed results

Table 2 details the codes omitted from the diagnosis table and the reasons for omission. We omit all codes with a low number of cases. We further omit 61 codes used to describe symptoms, as these are shared by multiple causes and will most probably be supplanted by a diagnosis code following medical investigation. Injuries and foreign bodies (30 codes) are omitted as well, as their treatment is usually orthopedic or surgical rather than medicinal. We omit the codes used in ICD-9 to classify birth age and pre-term phase for infants (14 codes), as these are more descriptive than diagnostic. Finally, we omit the E and V series of codes, which provide additional details for statistical purposes and do not cause differences in medicinal treatment. This leaves 567 codes and 54,423 cases (92.4%) that contain at least one of the remaining codes. Keeping only admissions contained in both the diagnosis and prescription tables leaves 50,211 admissions.
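The filtering steps described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the numeric ranges used for symptoms (780–799) and injuries (800–999) are the standard ICD-9 chapter ranges rather than the exact code list in Table 2, the birth-age codes are not handled, filtering is assumed to operate on 3-character categories, and `min_cases` is a hypothetical threshold because the paper excerpt does not state what counts as a low number of cases.

```python
from collections import Counter


def category(code: str) -> str:
    """Return the ICD-9 category of a code (3 characters, 4 for E codes)."""
    code = code.strip().upper().replace(".", "")
    return code[:4] if code.startswith("E") else code[:3]


def is_omitted(cat: str) -> bool:
    """Assumed exclusion rules; the actual omitted codes are listed in Table 2."""
    if cat[0] in ("E", "V"):      # supplementary E and V series
        return True
    number = int(cat)
    # Assumed ranges: symptoms (780-799), injury and poisoning (800-999).
    return 780 <= number <= 999


def filter_admissions(diagnoses, prescribed_ids, min_cases=2):
    """Keep admissions that have a prescription and at least one retained code.

    diagnoses: dict mapping admission id -> list of ICD-9 codes
    prescribed_ids: set of admission ids present in the prescription table
    min_cases: hypothetical minimum number of admissions per retained code
    """
    kept = {
        adm: {category(c) for c in codes if not is_omitted(category(c))}
        for adm, codes in diagnoses.items()
    }
    counts = Counter(cat for cats in kept.values() for cat in cats)
    frequent = {cat for cat, n in counts.items() if n >= min_cases}
    result = {}
    for adm, cats in kept.items():
        remaining = cats & frequent
        if remaining and adm in prescribed_ids:
            result[adm] = remaining
    return result


# Toy data (invented for illustration).
diagnoses = {1: ["4019", "78052"], 2: ["401.1", "V5861"], 3: ["4280"], 4: ["8020"]}
filtered = filter_admissions(diagnoses, prescribed_ids={1, 2, 4}, min_cases=2)
```

In the toy data, admission 3 is dropped because its only code is too rare and it has no prescriptions, and admission 4 is dropped because its only code falls in the injury range.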

Table 2. List of Omitted ICD-9 Codes and Code Groups

Detailed results are available online [10].

About this paper

Cite this paper

Sagi, T., Hansen, E.R., Hose, K., Lip, G.Y.H., Bjerregaard Larsen, T., Skjøth, F. (2020). Towards Assigning Diagnosis Codes Using Medication History. In: Michalowski, M., Moskovitch, R. (eds) Artificial Intelligence in Medicine. AIME 2020. Lecture Notes in Computer Science, vol 12299. Springer, Cham. https://doi.org/10.1007/978-3-030-59137-3_19

DOI: https://doi.org/10.1007/978-3-030-59137-3_19
Published: 26 September 2020
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59136-6
Online ISBN: 978-3-030-59137-3
eBook Packages: Computer Science, Computer Science (R0)
