Detecting inappropriate access to electronic health records using collaborative filtering
Many healthcare facilities enforce security on their electronic health records (EHRs) through a corrective mechanism: some staff nominally have almost unrestricted access to the records, but there is a strict ex post facto audit process for inappropriate accesses, i.e., accesses that violate the facility’s security and privacy policies. This process is inefficient, as each suspicious access has to be reviewed by a security expert, and is purely retrospective, as it occurs after damage may have been incurred. This motivates automated approaches based on machine learning using historical data. Previous work has successfully applied supervised learning models, such as SVMs and logistic regression, to this end. While providing benefits over manual auditing, these approaches ignore the identity of the users and patients involved in a record access. Therefore, they cannot exploit the fact that a patient whose record was previously involved in a violation has an increased risk of being involved in a future violation. Motivated by this, we propose a collaborative-filtering-inspired approach to predicting inappropriate accesses. Our solution integrates both explicit and latent features for staff and patients, the latter acting as a personalized “fingerprint” based on historical access patterns. When applied to real EHR access data from two tertiary hospitals and a file-access dataset from Amazon, the proposed method not only shows significantly improved performance compared to existing methods, but also provides insights into what indicates an inappropriate access.
Keywords: Access violation; Collaborative filtering; Electronic health records; Privacy breach detection
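The idea of combining explicit features with per-staff and per-patient latent "fingerprints" can be illustrated with a minimal sketch. The abstract does not specify the model form, so everything below is an assumption made for illustration: a logistic model whose score is a linear term on explicit features plus an inner product of latent vectors, trained by plain stochastic gradient descent on the log loss. The names `fit` and `predict`, the latent dimension `k`, and the toy data are hypothetical and are not taken from the paper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(w, U, P, x, u, p):
    """Estimated probability that staff member u's access to patient p is inappropriate.

    Score = explicit part (x @ w) + latent part (U[u] @ P[p]).
    """
    return sigmoid(x @ w + U[u] @ P[p])

def fit(accesses, n_staff, n_patients, n_features,
        k=4, lr=0.1, reg=0.01, epochs=200, seed=0):
    """Fit the assumed model by SGD on L2-regularized log loss.

    accesses: list of (staff_id, patient_id, feature_vector, label) tuples,
              where label is 1 for a violation and 0 otherwise.
    """
    rng = np.random.default_rng(seed)
    w = np.zeros(n_features)                         # weights on explicit features
    U = 0.1 * rng.standard_normal((n_staff, k))      # latent staff vectors
    P = 0.1 * rng.standard_normal((n_patients, k))   # latent patient vectors
    for _ in range(epochs):
        for i in rng.permutation(len(accesses)):
            u, p, x, y = accesses[i]
            err = predict(w, U, P, x, u, p) - y      # d(log loss)/d(score)
            u_old = U[u].copy()                      # keep pre-update value for P's gradient
            w -= lr * (err * x + reg * w)
            U[u] -= lr * (err * P[p] + reg * U[u])
            P[p] -= lr * (err * u_old + reg * P[p])
    return w, U, P

# Toy data (hypothetical): every access has the same explicit feature vector,
# but accesses to patient 0 were always violations and others never were.
x_const = np.array([1.0])
accesses = [(u, p, x_const, 1 if p == 0 else 0)
            for u in range(3) for p in range(3)]

w, U, P = fit(accesses, n_staff=3, n_patients=3, n_features=1)
p_risky = predict(w, U, P, x_const, 0, 0)   # patient with a history of violations
p_other = predict(w, U, P, x_const, 0, 1)   # patient without such a history
```

In this toy setting the explicit features are identical for every access, so a feature-only classifier could not separate the two patients; any difference between `p_risky` and `p_other` comes from the latent vectors, which is the kind of identity-specific signal the abstract argues earlier supervised approaches ignore.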