Constrained elastic net based knowledge transfer for healthcare information exchange

Data Mining and Knowledge Discovery

Abstract

Transfer learning methods have been applied successfully to a wide range of real-world problems. However, there have been almost no attempts to use these methods effectively in healthcare applications. In the healthcare domain, it is critical to resolve the “when to transfer” issue of transfer learning: when the source and target domains are highly divergent, transfer learning can lead to negative transfer. Most existing work in transfer learning focuses on selecting useful information from the source to improve performance on the target task, but whether transfer learning can help, and when it should be applied to the target task, remain open challenges. In this paper, we address the “when to transfer” issue by proposing a sparse feature selection model based on the constrained elastic net penalty. As a case study, we evaluate the proposed model on diabetes electronic health records (EHRs) containing patient records from all fifty states in the United States. Our approach selects the relevant features for transferring knowledge from the source to the target tasks. The proposed model can measure the differences between multivariate data distributions conditional on the predicted model, and this measurement allows us to avoid unsuccessful transfer. We successfully transfer knowledge across states: for a state with too few records to build an individualized predictive model, information from other states improves the diagnosis of diabetes.
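
For readers unfamiliar with the penalty the abstract refers to, the following is a minimal sketch of the standard (unconstrained) elastic net of Zou and Hastie (2005), fitted by cyclic coordinate descent in the style of Friedman et al. (2010). It is not the constrained transfer model proposed in the paper, whose formulation is not given in this preview. The function names, the synthetic data, and the values of `lam` and `alpha` are illustrative assumptions.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator S(z, gamma) used in lasso-type updates."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, alpha, n_iter=200):
    """Cyclic coordinate descent for the (unconstrained) elastic net.

    Minimizes (1/2n)*||y - X b||^2
              + lam * (alpha*||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    X is a list of rows; no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            denom = col_sq[j] + lam * (1.0 - alpha)
            if denom == 0.0:
                continue  # all-zero column under a pure L1 penalty
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_bj = soft_threshold(rho, lam * alpha) / denom
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Hypothetical synthetic data: features 0 and 1 are relevant, 2-4 are noise.
random.seed(0)
n_samples = 100
X = [[random.gauss(0, 1) for _ in range(5)] for _ in range(n_samples)]
y = [2.0 * row[0] - 1.5 * row[1] + random.gauss(0, 0.1) for row in X]

beta = elastic_net_cd(X, y, lam=0.2, alpha=0.9)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print("coefficients:", [round(b, 3) for b in beta])
print("selected features:", selected)
```

In this sketch the \(L_1\) part of the penalty should drive the coefficients of the irrelevant features to exactly zero, which is the sparse feature selection behavior the abstract builds on, while the \(L_2\) part shrinks the coefficients that remain.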

Notes

  1. Although Zhou et al. (2012) call this method multi-task Lasso, its optimization formulation uses both \(L_1\)-norm and \(L_2\)-norm penalties.

References

  • Arnold A, Nallapati R, Cohen WW (2007) A comparative study of methods for transductive transfer learning. In: Seventh IEEE international conference on data mining workshops, 2007. ICDM Workshops 2007, p 77–82

  • Blitzer J, Dredze M, Pereira F (2007) Biographies, bollywood, boom-boxes and blenders: Domain adaptation for sentiment classification. ACL 7:440–447

  • Caruana R (1997) Multitask learning. Mach Learn 28(1):41–75

  • Dai W, Yang Q, Xue G, Yu Y (2007) Boosting for transfer learning. In: ICML’07: Proceedings of the 24th international conference on Machine learning, p 193–200

  • Dai W, Yang Q, Xue GR, Yu Y (2008) Self-taught clustering. In: Proceedings of the 25th international conference on machine learning, ACM, p 200–207

  • Donoho DL, Johnstone JM (1994) Ideal spatial adaptation by wavelet shrinkage. Biometrika 81(3):425–455

  • Evgeniou A, Pontil M (2007) Multi-task feature learning. In: Proceedings of the 2006 conference on advances in neural information processing systems, vol. 19. The MIT Press, Cambridge, p 41

  • Evgeniou T, Pontil M (2004) Regularized multi-task learning. In: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, p 109–117

  • Farhadi A, Forsyth D, White R (2007) Transfer learning in sign language. In: IEEE Conference on computer vision and pattern recognition, CVPR’07, IEEE, p 1–8

  • Friedman J, Hastie T, Tibshirani R (2010) Regularization paths for generalized linear models via coordinate descent. J Stat Softw 33(1):1–22

  • Fung GPC, Yu JX, Lu H, Yu PS (2006) Text classification without negative examples revisit. IEEE Trans Knowl Data Eng 18(1):6–20

  • Hastie T, Tibshirani R, Friedman JH (2001) The elements of statistical learning. Springer, New York

  • Liu J, Ji S, Ye J (2009) Multi-task feature learning via efficient \(\ell_{2,1}\)-norm minimization. In: Proceedings of the twenty-fifth conference on uncertainty in artificial intelligence. AUAI Press, Corvallis, p 339–348

  • Mihalkova L, Mooney RJ (2008) Transfer learning by mapping with minimal target data. In: Proceedings of the AAAI-08 workshop on transfer learning for complex tasks

  • Pan J (2010) Feature-based transfer learning with real-world applications. Ph.D. thesis, The Hong Kong University of Science and Technology

  • Pan SJ, Yang Q (2010) A survey on transfer learning. IEEE Trans Knowl Data Eng 22(10):1345–1359

  • Pan SJ, Zheng VW, Yang Q, Hu DH (2008) Transfer learning for wifi-based indoor localization. In: Association for the advancement of artificial intelligence (AAAI) workshop, p 6

  • Practice Fusion Diabetes Classification: Identify patients diagnosed with Type 2 Diabetes (2012). https://www.kaggle.com/c/pf2012-diabetes

  • Raina R, Battle A, Lee H, Packer B, Ng AY (2007) Self-taught learning: transfer learning from unlabeled data. In: Proceedings of the 24th international conference on Machine learning, ACM, p 759–766

  • Rosenstein MT, Marx Z, Kaelbling LP, Dietterich TG (2005) To transfer or not to transfer. In: NIPS 2005 workshop on inductive transfer: 10 years later, vol. 2, p 7

  • Rückert U, Kramer S (2008) Kernel-based inductive transfer. In: Machine learning and knowledge discovery in databases. Springer, Heidelberg, pp 220–233

  • Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc B 58(1):267–288

  • Tseng P (2001) Convergence of a block coordinate descent method for nondifferentiable minimization. J Optim Theory Appl 109(3):475–494

  • Ye J, Liu J (2012) Sparse methods for biomedical data. ACM SIGKDD Explor Newslett 14(1):4–15

  • Zhou J, Chen J, Ye J (2012) Malsar: multi-task learning via structural regularization. Arizona State University, Phoenix

  • Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc B 67(2):301–320

Acknowledgments

This work was supported in part by NSF Grants IIS-1242304, IIS-1231742 and NIH Grant R21CA175974.

Author information

Corresponding author

Correspondence to Chandan K. Reddy.

Additional information

Responsible editors: Fei Wang, Gregor Stiglic, Ian Davidson and Zoran Obradovic.

About this article

Cite this article

Li, Y., Vinzamuri, B. & Reddy, C.K. Constrained elastic net based knowledge transfer for healthcare information exchange. Data Min Knowl Disc 29, 1094–1112 (2015). https://doi.org/10.1007/s10618-014-0389-3

  • DOI: https://doi.org/10.1007/s10618-014-0389-3
