Abstract
Patient similarity plays an important role in precision evidence-based medicine. While great efforts have been made to derive clinically meaningful similarity measures, how to retrieve similar patients accurately and efficiently from large-scale healthcare data remains less explored. Similar patient retrieval has become increasingly important and challenging as the volume of healthcare data grows rapidly. To address this challenge, we propose a coarse-to-fine approach using binary hash codes and embedding vectors derived from an artificial neural network. Experimental results demonstrated that this approach can reduce retrieval time by up to \(50.6\%\) without sacrificing retrieval accuracy. The time reduction became more evident as the data size increased, and retrieval efficiency increased with the number of bits in the binary hash codes. Descriptive analysis revealed distinct profiles between similar patients and the overall patient cohort.
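The coarse-to-fine idea can be illustrated with a minimal sketch. The abstract does not give the implementation details, so everything below is an assumption made for illustration: the function name `coarse_to_fine_search`, the use of Hamming distance for the coarse step, Euclidean distance for the fine step, and hash codes obtained by thresholding the embeddings at zero (in the chapter, both codes and embeddings come from a trained neural network).

```python
import numpy as np


def coarse_to_fine_search(query_code, query_emb, codes, embs, n_candidates=50, k=5):
    """Return indices of the k patients closest to the query (hypothetical helper).

    codes: (n, b) array of 0/1 hash bits; embs: (n, d) float embeddings.
    """
    # Coarse step: Hamming distance, i.e. the number of differing bits per patient.
    hamming = np.count_nonzero(codes != query_code, axis=1)
    n_candidates = min(n_candidates, len(codes))
    # Keep the n_candidates patients with the smallest Hamming distance (unordered).
    shortlist = np.argpartition(hamming, n_candidates - 1)[:n_candidates]
    # Fine step: exact Euclidean distance, computed on the shortlist only.
    dists = np.linalg.norm(embs[shortlist] - query_emb, axis=1)
    return shortlist[np.argsort(dists)[:k]]


# Toy data standing in for network outputs (assumption: sign-thresholded codes).
rng = np.random.default_rng(0)
embs = rng.normal(size=(1000, 16))
codes = (embs > 0).astype(np.uint8)

# Query with patient 0's own code and embedding, then list its nearest neighbours.
top = coarse_to_fine_search(codes[0], embs[0], codes, embs, n_candidates=50, k=5)
print(top)
```

The expensive distance computation touches only `n_candidates` rows rather than the whole cohort, which is where a time saving of the kind reported above would come from. A larger shortlist trades speed for a lower risk of missing a true neighbour.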
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Wang, K. et al. (2021). Fast Similar Patient Retrieval from Large Scale Healthcare Data: A Deep Learning-Based Binary Hashing Approach. In: Shaban-Nejad, A., Michalowski, M., Buckeridge, D.L. (eds) Explainable AI in Healthcare and Medicine. Studies in Computational Intelligence, vol 914. Springer, Cham. https://doi.org/10.1007/978-3-030-53352-6_2
DOI: https://doi.org/10.1007/978-3-030-53352-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-53351-9
Online ISBN: 978-3-030-53352-6
eBook Packages: Intelligent Technologies and Robotics (R0)