Mobile Health, pp. 389-409

Time Series Feature Learning with Applications to Health Care



Exponential growth in mobile health devices and electronic health records has resulted in a surge of large-scale time series data, which demands effective and fast machine learning models for analysis and discovery. In this chapter, we discuss a novel deep learning framework that automatically performs feature learning from heterogeneous time series data. It is well suited to healthcare applications, where the available data have many sparse outputs (e.g., rare diagnoses) and exploitable structure (e.g., temporal order and relationships between labels). Furthermore, we introduce a simple yet effective knowledge-distillation approach that learns an interpretable model while achieving the prediction performance of deep models. We conduct experiments on several real-world datasets and show the empirical efficacy of our framework and the interpretability of the resulting mimic models.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  1. Department of Computer Science, University of Southern California, Los Angeles, USA
  2. Children’s Hospital Los Angeles, Los Angeles, USA
